Padding and Unpadding CSV Files

When viewing comma-delimited text files or CSV files in a text editor, the columns won’t align unless the fields were padded with spaces to make their widths equal. Other applications may treat any whitespace as significant, requiring an unpadded CSV file. With PowerGREP, you can easily pad and unpad CSV files.

Padding CSV Files

To pad CSV files, we’ll search using a regular expression that matches one CSV field. As the replacement, we’ll re-insert the regex match padded with spaces to a specific length. Padding is easy in PowerGREP with match placeholders.

  1. Select the files you want to pad in the File Selector.
  2. Start with a fresh action.
  3. Set the action type to “search and replace”. Leave the search type as “regular expression”.
  4. Enter the regular expression [ \t]*+("[^"\r\n]*+"[ \t]*+|[^,\r\n]*+). This regular expression matches a single field in a CSV file. The field can be enclosed in double quotes, and extra whitespace is allowed before and after the quotes.
  5. Enter %GROUP1:40L% as the replacement text. This match placeholder inserts the text matched by the first (and only) capturing group in the regular expression, with as many spaces added at the left side to make it at least 40 character long. Fields that are longer to begin with will not be truncated. We’re using the capturing group in the regular expression to discard any padding that may already be present in the file, so everything gets padded to 40 characters.
  6. Set the target and backup file options as you like them.
  7. Click the Preview button to run a test.
  8. If all looks well, click the Replace button to actually pad the CSV files.

You can find this action in the PowerGREP5.pgl standard library as “CSV: Pad fields”.

Unpadding CSV Files

To remove the padding from the CSV file, you can use the exact same action. Only the replacement text will be different. The regular expression uses a capturing group to store the actual text of the CSV field, separate from the whitespace at the start of the field that the overall regular expression also matches. So to remove the padding, all we need to do is to replace the overall regex match with the text matched by the capturing group. You can do this with the placeholder %GROUP1% or simply the backreference $1.

You can find this action in the PowerGREP5.pgl standard library as “CSV: Unpad fields”.