Normalizing csv Import and AutoCat

When importing historical data, the description field of the file downloaded from my bank (Wells Fargo) is ALL CAPS sometimes but not others. Because AutoCat is case-sensitive, this creates problems with the AutoCat feature (which I think is one of Tiller’s major benefits).

Modifying the downloaded description column using Google Sheets Lower() function won’t work because some of the text strings that I’m using to AutoCat start with a capital letter.

This means adding an Upper Case and lower case version of the matching criteria in AutoCat for a lot of items and I’m hoping there’s a way around this.

I’m sure there’s a reason that the matching criteria in AutoCat needs to be case sensitive but has anyone found a way to get around this issue?

ScottC
Saluda, NC

I think you can handle this using a Regex column. On your AutoCat sheet add a column called “Full Description Regex” or “Description Regex” depending on which you want to run the rule against. You can then use a Regex rule to be very specific about what you’re looking for. I’m not an expert on Regex, but I’ve been using a rule that looks for any Full Description that begins with “PAYPAL”:
^(?!PAYPAL).*$
To look for a phrase but be case insensitive I think you would use this:
(?i)(?:PAYPAL)

Here’s an excellent tool for helping to figure out your Regex rule: https://regex101.com/

Thank you. I’ll give that a try. Personally, I think Regex is the very definition of “counter-intuitive.” I know it’s extremely powerful but the syntax is difficult – to say the least.

Thanks for the link to the tool.

ScottC

To avoid a conflict, when I add a rule in the new column (Description Regex), I assume I need to clear the previous rule in the column, “Description Contains.” Correct?

Also, can I have some rules in the Description Contains column and others in the new column Description Regex as long as there aren’t 2 rules in the same row?

I think this is going to be very helpful.

Thanks again.

ScottC

Yeah, I hear ya, Regex is no fun, but it does come in handy every now and then. I would definitely clear out the other rule in the same row, two rules trying to work on the same field is likely going to cause pain. Yea, you can have multiple rules, as long as they are in different rows. Once it matches a rule it stops looking, so rules further down will have no impact (so prioritize your rules with more important towards the top).

Joseph:

I thought I’d share what I’ve done so far with RegexMatch to allow AutoCat to categorize transactions without regard for case in the Description field from the csv file.

I’ve tried two methods.

  • Method 1: The syntax you suggested yesterday that includes the literal search string in the RegexMatch formula. (with the Regex expression wrapped in quotes.)
  • Method 2: The Regex formula references another cell with the match value. I thought this might be an easier way to test and modify matching rule.

RESULTS

Method 1: Insert Match String into Regex Formula | | | | |

  • | - | - | - | - |
    Description | Formula | Evaluation | Result is | Notes |
    PURCHASE RETURN AUTHORIZED ON 08/29 PAYPAL *DOLCE VITA 718-308-2314 NY | =regexmatch(a3,“(?i)(?:Paypal)”) | TRUE | Correct. | “Paypal” is in cell A3 |

=regexmatch(a4,“(?i)(?:Paypal)”) | FALSE | Correct. | Cell A4 is empty so “Paypal” is not present |

Method 1 works but it requires editing each formula with the match string. | | | | |

Method 2: Reference Match string from another cell | | | | |
Description | Formula | Match Value | Evaluation | Result Is |
PURCHASE RETURN AUTHORIZED ON 08/29 PAYPAL *DOLCE VITA 718-308-2314 NY | =regexmatch(A9,“(?i)(?:”& c$9 & “)”) | vita | TRUE | Correct |
PURCHASE RETURN AUTHORIZED ON 08/29 PAYPAL *DOLCE VITA 718-308-2314 NY | =regexmatch(A10,“(?i)(?:”& c$10 & “)”) | cat | FALSE | Correct. “cat” is not in A10 |
PURCHASE RETURN AUTHORIZED ON 08/29 PAYPAL *DOLCE VITA 718-308-2314 NY | =regexmatch(A11,“(?i)(?:”& c$11 & “)”) | dog | FALSE | Correct. “dog” is not in A11 |
PURCHASE RETURN AUTHORIZED ON 08/29 PAYPAL DOLCE VITA 718-308-2314 NY | =regexmatch(A12,“(?i)(?:”& c$12 & “)”) | dolce | TRUE | Correct |
PURCHASE AUTHORIZED ON 08/31 WM SUPERC Wal-Mart Sup | =regexmatch(A13,“(?i)(?:”& c$13 & “)”) | wal-mart | TRUE | wal-mart |
ATM WITHDRAWAL AUTHORIZED ON 08/31 301 S MAIN ST HENDERSONVILL NC | =regexmatch(A14,“(?i)(?:”& c$14 & “)”) | 301 S Main | TRUE | 301 S Main |
PURCHASE AUTHORIZED ON 08/29 AMZN Mktp US
AF4MD Amzn.com/ | =regexmatch(A15,“(?i)(?:”& c$15 & “)”) | | TRUE | Incorrect. C15 is empty |

Method 2 works EXCEPT when the cell with the match value is empty. For some reason, it evaluates as True. See red-filled cells.

Link to Google Sheet.

Do you know how to modify the Regex expression to return False if the cell containing the Match Value is empty?

Regards,

ScottC
Saluda, NC

Don’t have access to the shared link. If you’re already using a formula successfully, you should just be able to do something like this:
=IF(ISBLANK(A2),"",regexmatch(A2,“(?i)(?:”& c$2 & “)”)
Which just says, if A2 is blank, do nothing, if it’s not blank, then do this regexmatch function.