SafeChars plugin for Take Command / TCC / 4NT

Version 1.9.3     2016-08-16

Charles Dye

Purpose:

This plugin provides functions for reading and writing text which may contain characters with special meaning to TCC (“dangerous” characters).

This plugin requires Take Command / TCC / 4NT version 8 or later. Older versions are not supported.

Installation:

To use this plugin, copy the files SafeChars.dll and SafeChars.chm to some known location on your hard drive. (If you are using the x64 version of Take Command, use SafeChars-x64.dll instead of SafeChars.dll.) Load the plugin with a PLUGIN /L command. For example:

plugin /l c:\bin\tcmd\safechars\safechars.dll

If you copy the .DLL to a subdirectory named PlugIns within your Take Command program directory, the plugin will be loaded automatically by each new instance of TCC.

Theory:

This plugin takes advantage of Unicode’s “Halfwidth and Fullwidth Forms” block, specifically the characters at U+FF01 through U+FF5E. These characters correspond to ASCII characters 0x21 through 0x7E, and when redirected to a file with //UnicodeOutput=No, they will even be automatically translated back to their ASCII equivalents. But TCC doesn’t assign any significance to these remapped “safe” characters, so you can safely use them in a batch file and they will be handled like ordinary text characters.
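
The remapping is a fixed offset: each fullwidth form sits 0xFEE0 (65248) above its ASCII counterpart, so the ampersand (38) becomes U+FF06 (65286). A minimal sketch of that arithmetic, assuming the plugin is loaded and that @CHAR in your TCC version accepts values above 255 and returns the matching Unicode character:

```batch
rem Assumption: @CHAR returns a Unicode character for values above 255.
rem 38 is the ASCII ampersand; 38 + 65248 = 65286 = 0xFF06.
set amp=%@char[%@eval[38 + 65248]]
rem The fullwidth ampersand is ordinary text to TCC, so this is one command
echo Tom %amp Jerry
if "%amp" == "%_AMP" echo Both methods returned the same character
```

In practice the plugin's internal variables and functions do this for you; the sketch only shows where the “safe” code points come from.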

Depending on your font, whether TCC is running in a standalone console window or within a Take Command tab, the phase of the moon and general Windows witchiness, the remapped “safe” characters may or may not appear the same on-screen as their ASCII equivalents. If you see empty boxes instead of normal-looking characters, try a different font. The default “Raster Fonts” works well for me, although it’s not terribly attractive.

By default, the characters defined as “dangerous” by this plugin are:

Character                       ASCII         Remapped to
double quotes              "    34 / 0x22     U+FF02
percent sign               %    37 / 0x25     U+FF05
ampersand                  &    38 / 0x26     U+FF06
open parenthesis           (    40 / 0x28     U+FF08
close parenthesis          )    41 / 0x29     U+FF09
less-than sign             <    60 / 0x3C     U+FF1C
greater-than sign          >    62 / 0x3E     U+FF1E
open bracket               [    91 / 0x5B     U+FF3B
close bracket              ]    93 / 0x5D     U+FF3D
caret                      ^    94 / 0x5E     U+FF3E
grave accent / backquote   `    96 / 0x60     U+FF40
vertical bar               |    124 / 0x7C    U+FF5C

You can customize this list, adding or removing unsafe characters, with the UNSAFE command.

Note that this list includes the default command separator and the default escape character. If you use non-default settings for these characters, I suggest changing back to the defaults at least temporarily while using this plugin’s features:

setlocal
setdos /c38 /e94 /p36
...

endlocal

As mentioned above, when //UnicodeOutput=No you can redirect these “safe” characters and they will automatically be replaced with their ASCII equivalents. This convenient fix won’t happen when //UnicodeOutput=Yes, though; you’ll get strange characters in the output file. To write Unicode to a file, converting the remapped “safe” characters back to their original values, a @SAFEWRITE function is provided.

No provision is made for remapping or handling the ASCII NUL, character 0.

Syntax Note:

The syntax definitions in the following text use these conventions for clarity:

BOLD CODE        indicates text which must be typed exactly as shown.
CODE             indicates optional text, which may be typed as shown or omitted.
Bold italic      names a required argument; a value must be supplied.
Regular italic   names an optional argument.
ellipsis…        after an argument means that more than one may be given.

Plugin Features:

New commands: SAFEARRAY, SAFECHARSHELP, SAFEECHO, SAFEECHOS, UNSAFE

New functions: @COUNTSAFE, @COUNTSAFEX, @COUNTUNSAFE, @COUNTUNSAFEX, @SAFECHARSINFO, @SAFECLIP, @SAFECLIPW, @SAFEENV, @SAFEEXP, @SAFELINE, @SAFEREAD, @SAFEWRITE, @UNSAFE, @UNSAFEESC

New variables: _AMP, _BQ, _CARET, _CLOSEBRK, _CLOSEPAT, _GT, _LT, _OPENBRK, _OPENPAT, _PCT, _QUOTE, _VBAR

Commands:


SAFEARRAY — Replaces dangerous characters or quotes elements in an array.

Syntax:
SAFEARRAY /C:n /R:n /Q /QE arrayname

/C:n        the column to change; 0-based, defaults to all
/R:n        the row to change; 0-based, defaults to all
/Q          quote non-empty elements instead of replacing characters
/QE         quote all elements, even empty
arrayname   the array to modify

The arrayname is required. The array must already exist; it must have one or two dimensions. If /C:n is not specified, all columns will be affected; if /R:n is not specified, all rows will be affected. The default action is to remap any “dangerous” characters in array elements. If /Q is specified, each array element will instead be enclosed in double quotes unless it is empty. If /QE is used, all array elements will be double-quoted, even if they are empty.

This command is not available in TCC/LE or in versions prior to v10.
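
A minimal sketch of typical use: load a file into a one-dimensional array, then remap every element in place. The file name notes.txt is hypothetical, and the /R option of SETARRAY (which loads one line per element) is assumed to be available in your TCC version.

```batch
rem notes.txt is a hypothetical file; SETARRAY /R is an assumption
rem about your TCC version, not something this plugin provides.
setarray /r notes.txt lines
rem Remap dangerous characters in every element of the array
safearray lines
echo First line: %lines[0]
unsetarray lines
```

To quote the elements instead of remapping them, replace the SAFEARRAY line with safearray /q lines.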



SAFECHARSHELP — Opens this plugin’s help file.

Syntax:
SAFECHARSHELP topic

topic   the page to display

The SAFECHARSHELP command will locate and open this plugin’s help file. In most cases, the internal HELP command, and the F1 and Ctrl-F1 keys, will be more convenient. The sole advantage to this command is that it can be used to open the help file to any desired topic, not only to the names of commands, functions, and variables.



SAFEECHO — Displays text to standard output, restoring any remapped “safe” characters to their original, possibly dangerous values.

Syntax:
SAFEECHO text

text   the string to display

A carriage return and line feed will be written after text. The text may or may not be written in Unicode, depending on the value of the //UnicodeOutput option.

Note: All characters in the range U+FF00 through U+FF5F will be replaced with their normal, ASCII equivalents.

Note: When //UnicodeOutput=No (the default setting), this command offers no benefit over ECHO.
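
A short sketch of the case where SAFEECHO does matter, namely appending remapped text to a file while //UnicodeOutput=Yes. The file name typed.txt is hypothetical.

```batch
rem typed.txt is a hypothetical output file.
input Type some text: %%text
set text=%@safeenv[text]
rem With //UnicodeOutput=Yes, ECHO would write the fullwidth forms here;
rem SAFEECHO restores the original characters before writing.
safeecho %text >> typed.txt
```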



SAFEECHOS — Displays text to standard output, restoring any remapped “safe” characters to their original, possibly dangerous values. A carriage return and line feed will not be appended.

Syntax:
SAFEECHOS text

text   the string to display

The text may or may not be written in Unicode, depending on the current value of //UnicodeOutput.

Note: All characters in the range U+FF00 through U+FF5F will be replaced with their normal, ASCII equivalents.

Note: When //UnicodeOutput=No (the default setting), this command offers no benefit over ECHOS.



UNSAFE — Enables, disables, saves, restores, or lists dangerous characters.

Syntax:
UNSAFE /D:chars /E:chars /R /S /Z

/D:chars   disable (treat as safe) the following characters
/E:chars   enable (treat as dangerous) the following characters
/R         restore the list of dangerous characters from the registry
/S         save the list of dangerous characters to the registry
/Z         reset the list of dangerous characters to the plugin defaults

This command allows you to customize the list of “dangerous” characters. You can enable more characters, so the plugin will treat them as dangerous and remap them to Unicode alternates, or disable characters so they are not considered dangerous and will not be remapped. Only characters in the range of 32 - 127 can be remapped. You can also save your modified list of dangerous characters to the registry; it will then be reloaded automatically when the plugin starts.

The /D:chars and /E:chars options disable and enable characters. chars may be specified as: a single character; a decimal value from 32 to 127, or a hexadecimal value from 0x20 to 0x7F; a character range specified as two characters/values separated by a minus sign; or a list of two or more characters/ranges separated by commas or semicolons.

/S saves the current settings, making them the default when the plugin is loaded in the future. This data is saved in the registry, in a key named HKEY_CURRENT_USER\SOFTWARE\JPPlugins\SafeChars. /R reloads the settings from the registry, as if the plugin were unloaded and reloaded. /Z resets the dangerous characters to the plugin’s defaults.

For example, to include the comma as a dangerous character:

unsafe /e:,

To include the comma and make this setting the default in the future:

unsafe /e:, /s

Most options can be combined (you cannot combine /R and /Z); combined options will be processed in a reasonable order. A /R or /Z is handled before any /D or /E; a /S is handled after all other options. If you do not specify any options at all, UNSAFE will display the current list of dangerous characters.
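
A few more combinations, sketched from the syntax rules above. Numeric values are used for the comma so that it cannot be mistaken for a list separator.

```batch
rem Reset to the plugin defaults, then also treat the comma (44) as dangerous
unsafe /z /e:44
rem Stop remapping the square brackets, given as a list of hex values
unsafe /d:0x5B,0x5D
rem Treat the digits as dangerous, given as a range of two characters
unsafe /e:0-9
rem Display the current list of dangerous characters
unsafe
rem Discard these changes and reload the list saved in the registry
unsafe /r
```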



Plugin Functions:


@COUNTSAFE — Returns the number of “safe” characters in a string.

Syntax:
%@COUNTSAFE[text]

text   the string to examine

This function counts the number of characters in the range U+FF00 through U+FF5F in the text. Whether these were “dangerous” characters remapped by this plugin, or whether they were present to begin with, is unknown.



@COUNTSAFEX — Attempts to expand an internal variable, function, or array element, and then returns the number of “safe” characters in the resulting string.

Syntax:
%@COUNTSAFEX[text]

text   the variable, function, or array element to expand

This function performs variable expansion on the variable, function, or array element named by text, and then counts the number of characters in the range U+FF00 through U+FF5F in the resulting string. Whether these were “dangerous” characters remapped by this plugin, or whether they were present to begin with, is unknown.

Note: Do not type a percent sign before the variable or function name. If you do, TCC’s parser will expand the argument before @COUNTSAFEX gets to see it.



@COUNTUNSAFE — Returns the number of “dangerous” characters in a string.

Syntax:
%@COUNTUNSAFE[text]

text   the string to examine

This function counts the number of “dangerous” characters, as defined by the UNSAFE command, in the input text.



@COUNTUNSAFEX — Attempts to expand an internal variable, function, or array element, and then returns the number of “dangerous” characters in the resulting string.

Syntax:
%@COUNTUNSAFEX[text]

text   the variable, function, or array element to expand

This function performs variable expansion on the variable, function, or array element named by text, and then counts the number of “dangerous” characters, as defined by the UNSAFE command, in the resulting string.

Note: Do not type a percent sign before the variable or function name. If you do, TCC’s parser will expand the argument before @COUNTUNSAFEX gets to see it.
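
One possible use, sketched here: test whether a variable holds anything dangerous before deciding to remap it. Note that the variable name is passed without a percent sign in both function calls.

```batch
input Type some text: %%text
rem Count dangerous characters without letting the parser see them
iff %@countunsafex[text] gt 0 then
   set text=%@safeenv[text]
endiff
echo You typed %text
```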



@SAFECHARSINFO — Returns internal plugin info.

Syntax:
%@SAFECHARSINFO[n,text]

n   selects the plugin info to return:
     0: the plugin version number; parts separated by periods
     1: a 12-digit hex (i.e. 96-bit) bitmap of current “unsafe” characters
     2: return text made safe according to the current settings

This function is intended for use by other plugins. Most users will never need it; it is of little or no use in batch files or at the command line.



@SAFECLIP — Returns a line from the clipboard, replacing any dangerous characters with “safe” equivalents.

Syntax:
%@SAFECLIP[n]

n   line numbering starts at 0

This function calls TCC’s built-in @CLIP function and massages whatever text it returns. See the Take Command documentation for more information on @CLIP.

If you attempt to read past the end of the clipboard, @SAFECLIP will return the value **EOC**.
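
The **EOC** marker makes it straightforward to walk the whole clipboard. A minimal sketch, using only the behaviour described above plus TCC's DO and @INC:

```batch
set n=0
do forever
   set line=%@safeclip[%n]
   rem Stop once @SAFECLIP reports the end of the clipboard
   if "%line" == "**EOC**" leave
   echo Line %n = %line
   set n=%@inc[%n]
enddo
```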



@SAFECLIPW — Writes text to the clipboard, restoring any remapped “safe” characters to their original, possibly dangerous values.

Syntax:
%@SAFECLIPW[text]

text   the string to save on the clipboard


@SAFEENV — Returns the value of an environment variable, replacing any dangerous characters with “safe” equivalents.

Syntax:
%@SAFEENV[var]

var   the name of an environment variable

input Type some text: %%text
set text=%@safeenv[text]
echo You typed "%text".

This function is useful with commands such as INPUT or DO var IN @file, which stash text directly in an environment variable without first passing it through the command parser. Do not type a percent sign before the variable name. If you do, TCC’s parser will expand the variable before @SAFEENV ever sees it.

Note: This function only reads text from environment variables, not from internal variables, functions, or array elements. For anything other than an environment variable, use @SAFEEXP instead.

Note: If you want to use this function in conjunction with the FOR command, be aware that FOR only stores its control variable in the environment if the variable name is more than one character long.
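
The same approach works with DO var IN @file. In this sketch notes.txt is a hypothetical input file; each line lands in the variable line unparsed, and @SAFEENV remaps it before it is used on a command line.

```batch
rem notes.txt is a hypothetical input file.
do line in @notes.txt
   rem Pass the variable name without a percent sign
   set safe=%@safeenv[line]
   echo Read: %safe
enddo
```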



@SAFEEXP — Attempts to expand an internal variable, function, or array element, and then remaps any dangerous characters to “safe” alternatives.

Syntax:
%@SAFEEXP[text]

text   the variable, function, or array element to expand

echo The current command separator is %@safeexp[=]
echo %@safeexp[@ftype[%@assoc[.btm]]]

Note: Do not type a percent sign before the variable or function name. If you do, TCC’s parser will expand the argument before @SAFEEXP gets to see it.



@SAFELINE — Reads the specified line from a file, mapping any dangerous characters to safe alternatives.

Syntax:
%@SAFELINE[filename,n]

filename   the file to read; quote it if it contains spaces
n          line numbering starts at 0

This function calls TCC’s built-in @LINE function and massages whatever text it returns. See the Take Command documentation for more information on @LINE.

If you attempt to read past the end of the file, @SAFELINE will return the value **EOF**.



@SAFEREAD — Reads text from a file, mapping any dangerous characters to safe alternatives.

Syntax:
%@SAFEREAD[handle,length]

handle   a file handle opened with @FILEOPEN
length   the number of bytes to read; if not specified, one line is read

This function calls TCC’s built-in @FILEREAD function and massages whatever text it returns. See the Take Command documentation for more information on @FILEREAD.

If you attempt to read past the end of the file, @SAFEREAD will return the value **EOF**.



@SAFEWRITE — Writes a line of text to a file, replacing any remapped “safe” characters with their original, possibly dangerous, values.

Syntax:
%@SAFEWRITE[handle,text]

handle   a file handle opened with @FILEOPEN, or - for stdout
text     a line of text to write to the file

A carriage return and line feed will be written to the file after text. The text may or may not be written in Unicode, depending on the value of the //UnicodeOutput option. If //UnicodeOutput=Yes and the output file is empty, a Byte Order Mark (BOM) will be written before the text.

This function returns the number of characters written to the file, not counting any BOM. If an error was detected, -1 will be returned.

Note: All characters in the range U+FF00 through U+FF5F will be replaced with their normal, ASCII equivalents. (If we’re going to mangle Japanese text files, let’s at least mangle them consistently….)

Note: When //UnicodeOutput=No (the default setting), this function is unnecessary. @FILEWRITE or output redirection will work just as well.
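
@SAFEREAD and @SAFEWRITE are designed to be used as a pair. The sketch below copies a file line by line, with each line held in “safe” form while it passes through the batch file. The file names are hypothetical; @FILEOPEN and @FILECLOSE are TCC's built-in functions.

```batch
rem in.txt and out.txt are hypothetical file names.
set in=%@fileopen[in.txt,read]
set out=%@fileopen[out.txt,write]
do forever
   rem With no length given, @SAFEREAD reads one line
   set line=%@saferead[%in]
   if "%line" == "**EOF**" leave
   rem The characters are restored to their originals as they are written
   set count=%@safewrite[%out,%line]
enddo
set rc=%@fileclose[%in]
set rc=%@fileclose[%out]
```

A production script would also check both handles and the @SAFEWRITE result for -1 before continuing.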



@UNSAFE — Replaces “safe” characters in a string with potentially dangerous ones.

Syntax:
%@UNSAFE[text]

text   text, possibly containing “safe” characters

I do not recommend the use of this function. The returned value may contain ampersands, pipes, redirection operators, or who-knows-what. If you decide that you must use it, I strongly recommend wrapping it in double quotes to limit the carnage.

set test="%@unsafe[%test]"


Note: All characters in the range U+FF00 through U+FF5F will be remapped to ASCII equivalents.



@UNSAFEESC — Replaces “safe” characters in a string with potentially dangerous ones, and escapes them.

Syntax:
%@UNSAFEESC[text]

text   text, possibly containing “safe” characters

Any characters currently defined as unsafe will be escaped. Percent signs will be doubled; double-quote and backquote marks will be replaced with %=q and %=k; and any other unsafe characters will have a %= prefixed. Any characters not currently deemed unsafe will not be escaped.

echo Line = %@unsafeesc[%line]

Note: All characters in the range U+FF00 through U+FF5F will be remapped to ASCII equivalents.



Internal Variables:

These variables return the remapped, “safe” values for various characters. They can be used to search for or replace the remapped characters in functions such as @INDEX and @REPLACE. If the //UnicodeOutput option is set to NO, then these remapped characters can also be redirected to a file, and TCC’s built-in Unicode-to-ANSI translation will automagically replace them with the “dangerous” forms.

Variable     Returns
%_AMP        safe ampersand            &
%_BQ         safe backquote            `
%_CARET      safe caret                ^
%_CLOSEBRK   close bracket             ]
%_CLOSEPAT   close parenthesis         )
%_GT         safe greater-than sign    >
%_LT         safe less-than sign       <
%_OPENBRK    open bracket              [
%_OPENPAT    open parenthesis          (
%_PCT        safe percent sign         %
%_QUOTE      safe double quotes        "
%_VBAR       safe vertical bar         |
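
As a sketch of the search-and-replace use mentioned above, this replaces every remapped ampersand in some input with the word "and", using TCC's built-in @REPLACE. It assumes the text contains no commas, which could otherwise be taken as argument separators.

```batch
input Type some text: %%text
set text=%@safeenv[text]
rem Assumption: the text contains no commas.
rem Replace each safe ampersand with the word and
echo %@replace[%_AMP,and,%text]
```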

Size Limits:

All strings in this plugin are limited to 16,383 characters. Attempting to pass or return a string longer than 16,383 characters may result in an error message, silent truncation, or possibly even a crash.

Startup Message:

This plugin displays an informational line when it initializes. The message will be suppressed in transient or pipe shells. You can disable it for all shells by defining an environment variable named NOLOADMSG, for example:

set /e /u noloadmsg=1

Changes:

Version   Date         Changes
1.3.0     2010-10-17   .CHM help file
1.4.0     2011-01-03   Treat parentheses as dangerous, because they affect command parsing and redirection; added _OPENPAT and _CLOSEPAT. Treat square brackets as dangerous; added _OPENBRK and _CLOSEBRK. Minor tweak to @SAFEWRITE to not require a comma between args if the second argument is omitted.
1.4.1     2011-02-26   Tweaks to the help system; now it’s single-instance. Added SAFECHARSHELP.
1.4.2     2011-03-03   Further tweaks to the help system.
1.4.3     2011-03-28   Don’t call HtmlHelp() to close the help file on shutdown; this is insanely slow for some reason.
1.5.0     2011-06-17   Adds the UNSAFE command, which allows the user to customize the list of “dangerous” characters.
1.5.1     2011-07-29   Adds new function @UQUOTES.
1.5.2     2011-07-31   Tweaked @UQUOTES. Double quotes are now converted according to surrounding characters, not just a simple toggle.
1.5.3     2011-08-05   Further tweaks to @UQUOTES.
1.5.4     2011-09-29   Don’t strip leading whitespace from args to SAFEECHO / SAFEECHOS.
1.5.5     2011-10-08   @UQUOTES now converts double hyphens to em dashes; made a trivial change to @SAFEENV and @SAFEEXP. Tweaked style sheets and made other minor changes to the help.
1.5.6     2011-10-27   Internal tweaks to the code, and minor changes to the documentation style sheets. No new features or bug fixes.
1.5.7     2011-11-14   Minor tweak: @UQUOTES now recognizes two-digit year abbreviations.
1.6.0     2012-01-19   Added SAFEARRAY command to massage array elements en masse.
1.6.1     2012-01-19   Added @UNSAFE function.
1.6.2     2012-01-21   Added @COUNTSAFE, @COUNTSAFEX, @COUNTUNSAFE, and @COUNTUNSAFEX.
1.6.3     2012-02-19   Only strip leading whitespace from args to SAFEECHO / SAFEECHOS when checking for /?
1.6.4     2012-02-21   Close help window when plugin is unloaded.
1.6.5     2012-03-13   Minor tweaks to @UQUOTES, and documented its control variables.
1.6.6     2012-04-16   Adds a kludge to @UQUOTES to support ‘okinas.
1.7.0     2012-09-04   Adds @SAFECHARSINFO function (for use by other plugins).
1.7.1     2013-04-23   Updated the plugin’s web address.
1.7.2     2014-02-19   Minor tweaks to the help system.
1.8.0     2014-06-30   Added @UNSAFEESC function.
1.9.0     2014-10-29   Changes for TCC v17 compatibility. De-documented @UQUOTES — it’s also implemented in TextUtils, and I plan to remove it from a future version of this plugin.
1.9.1     2014-10-31   Bug fix: CallInternal() adding an asterisk when it shouldn’t.
1.9.2     2016-03-09   Reworks the help system, for systems where the documented HtmlHelp() API is broken.
1.9.3     2016-08-16   Fix for the plugin not loading in TCC v20.

Status and Licensing:

This software is copyright © 2016, Charles Dye.

The binaries and documentation for this plugin may be freely redistributed without restriction. I make no guarantee and give no warranty for its operation. If you find a problem, you can report it in the JP Software support forum.

Download:

You can download the current version of the plugin from http://prospero.unm.edu/dl/safechars.zip or ftp://prospero.unm.edu/safechars.zip.