Working on a More Advanced Version of AutoCat Using a Sentence Transformer ML Model

Wanted to say hi and see if anyone has been working on something similar.

I was wondering whether there was something better than rules-based categorisation of expenses, and I have been quite happy with my early results.

Interesting, @f.strauf. Can you share more about how it works? What roadblocks are you encountering? Do you plan to share it with the community? And where are you headed with this?

Yeah, absolutely.

It uses a sentence transformer machine learning model, which runs on Replicate. The model captures the meaning of a sentence, in this case the text of an expense. Using that meaning, you can find previously categorised expenses that mean the same thing and map their categories onto new expenses.
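To make the matching idea concrete, here is a minimal sketch of nearest-match categorisation. It is not the actual build: the `embed` function is a toy bag-of-words stand-in for a real sentence-transformer embedding (so it only sees shared words, not meaning), and the function names and sample expenses are made up for illustration.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for a sentence-transformer embedding: word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def suggest_category(description, labelled):
    """Return (category, similarity) of the closest already-categorised expense."""
    query = embed(description)
    best_desc, best_cat = max(
        labelled, key=lambda pair: cosine(query, embed(pair[0]))
    )
    return best_cat, cosine(query, embed(best_desc))

# Hypothetical expenses that already have a category applied.
labelled = [
    ("UBER TRIP HELP.UBER.COM", "Transport"),
    ("WHOLE FOODS MARKET 1023", "Groceries"),
    ("NETFLIX.COM SUBSCRIPTION", "Entertainment"),
]

category, score = suggest_category("UBER *TRIP 8842", labelled)
print(category, round(score, 2))
```

With a real embedding model, `embed` would return a dense vector and the rest of the logic would stay roughly the same.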

Right now it’s probably more something for technical people (i.e. you need to be able to code), but I’m hoping to turn this into a little Google Sheets Extension eventually.

I’m happy to share more details for the technical audience out here - mind you this goes beyond just Google Sheets script and will take a bit of work to set up.

I’ve written about it in more detail in a couple of blog posts, but I don’t think it’s ok to share them here.

The environment is partly Google Apps Script and partly Python, deployed on Google Cloud Run.

It currently works well for me, so the next steps are making it usable by others and tuning the dataset it is trained on (i.e. reducing the number of duplicate expenses in the training data).
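One way to cut down duplicate training expenses could look like the sketch below. It assumes that two expenses count as duplicates when their descriptions match after lower-casing and stripping digits and punctuation; that rule, and the names `normalise` and `dedupe`, are illustrative assumptions rather than what the project actually does.

```python
import re

def normalise(description):
    """Lower-case and drop digits/punctuation so near-identical expenses collapse."""
    text = re.sub(r"[^a-z\s]", " ", description.lower())
    return " ".join(text.split())

def dedupe(labelled):
    """Keep the first example for each (normalised description, category) pair."""
    seen = set()
    unique = []
    for description, category in labelled:
        key = (normalise(description), category)
        if key not in seen:
            seen.add(key)
            unique.append((description, category))
    return unique

# Hypothetical training rows: monthly charges differ only by a trailing number.
training = [
    ("NETFLIX.COM 0423", "Entertainment"),
    ("NETFLIX.COM 0523", "Entertainment"),
    ("Whole Foods #1023", "Groceries"),
    ("WHOLE FOODS #1187", "Groceries"),
    ("UBER TRIP", "Transport"),
]

deduped = dedupe(training)
print(len(training), "->", len(deduped))
```

Keeping the category in the key means the same merchant filed under two different categories survives as two examples.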

Any idea if you guys from Tiller looked into this as well?

We are starting to experiment with AI/ML+categorization and reporting. We are curious to see where your experiments lead. Please share what you can (and if you’re looking to deploy a proprietary solution we respect your discretion).

I know there is a lot of interest in this workflow so feel free to share links to your blog posts so long as they are relevant to Tiller and personal-finance categorization.

Good luck!

Sounds great. Would love to hear what solution you guys come up with too!

Here is a bunch more detail on the whole build so far.

https://florianstrauf.substack.com/p/a-google-sheet-addon-that-classifies

I’m just going through the process of publishing the app prototype on the Google Workspace Marketplace so people can try it out themselves.

Happy to share it here if anyone is interested.

Thanks for sharing the blog entry, @f.strauf. It’s a great write-up. I found the Next Steps, where you try to productize a working prototype, particularly interesting.

Have you thought about how you’d tell the model that it made a mistake? Do you just periodically retrain it? Do you create an interface that surfaces recent transactions that have been recategorized by the user?

I see the process as:

  1. train the model
  2. run the categorisation and let the model make suggestions
  3. train the model again
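The three steps above can be sketched as a small loop. This is a hedged illustration only: the `Categoriser` class is hypothetical, "training" here just means storing labelled examples, the word-overlap `similarity` stands in for embedding similarity, and feeding the user's correction into the second training round is an assumption about how mistakes would get fixed.

```python
def similarity(a, b):
    """Word-overlap (Jaccard) score; a stand-in for embedding similarity."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

class Categoriser:
    """Hypothetical helper: 'training' just stores labelled examples."""

    def __init__(self):
        self.examples = []

    def train(self, labelled):
        self.examples.extend(labelled)

    def suggest(self, description):
        best = max(self.examples, key=lambda ex: similarity(description, ex[0]))
        return best[1]

model = Categoriser()

# Step 1: train on expenses that already have a category.
model.train([("uber trip", "Transport"), ("whole foods market", "Groceries")])

# Step 2: let the model suggest; nothing similar exists yet, so this guess is poor.
before = model.suggest("amazon prime video")

# Step 3: train again, including the category the user actually chose.
model.train([("amazon prime video", "Subscriptions")])
after = model.suggest("amazon prime membership")

print(before, "->", after)
```

In this sketch the second round is what lets a similar expense pick up the corrected category the next time around.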

Most people will probably only do this once a month (would love to hear what people here think), so the compute cost won’t be too high.

I’ve built the thing for myself so far, and it didn’t seem too much effort to make it available to others.

What I found interesting is that when I experimented with AI categorization, it was very effective because I have many categories.
From Tiller’s perspective, I think the AI should run server-side and show up as an optional suggested category, similar to what we have now with the Yodlee “Category Hint” field.

Yeah, you could make it run completely in the background and only suggest categories based on super high confidence predictions.
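A rough sketch of the "only suggest when very confident" idea follows. The `category_hint` name, the 0.8 threshold, and the sample rows are all assumptions for illustration, and `difflib.SequenceMatcher` from the standard library stands in for embedding similarity.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Character-level match ratio; a stand-in for embedding similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def category_hint(description, labelled, threshold=0.8):
    """Return a category only when the best match clears the threshold, else None."""
    best_desc, best_cat = max(
        labelled, key=lambda ex: similarity(description, ex[0])
    )
    if similarity(description, best_desc) >= threshold:
        return best_cat
    return None

# Hypothetical expenses that already have a category applied.
labelled = [
    ("netflix.com 0423", "Entertainment"),
    ("uber trip", "Transport"),
]

confident = category_hint("netflix.com 0523", labelled)  # near-identical text
unsure = category_hint("corner cafe", labelled)          # nothing close enough

print(confident, unsure)
```

Returning `None` below the threshold leaves the hint blank, so a background process would only fill in categories it is fairly sure about.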

I’ve since moved from a Google Sheet app to a web app that accesses my Google Sheet and does the categorisation.