Introducing a Simple Module for Parsing CSV Files

Today I want to introduce a simple module for parsing CSV files.

Recently I was exploring my old repository: github.com/Food-Static-Data/food-datasets-c..

Inside I have a cool set of small modules that helped me a lot. My code is tightly coupled to those packages, so I have to pray for the developers who build them; thanks to them I don't need to spend precious time writing that code myself.

List of modules that I'm using:

  • csv-parser
  • fs
  • lodash
  • path
  • path-exists
  • shelljs

Why did I create this package? It's simple. During our work @ Groceristar, we came across a number of databases and datasets related to "food tech". To extract that data and play with it, you need to parse CSV files.

Link to the repository: github.com/Food-Static-Data/food-datasets-c..

Link to npm page: npmjs.com/package/@groceristar/food-dataset..

I will also post updates about building modules for static data on Indie Hackers. While it didn't help much with promotion, founders are pretty positive people and their support really matters. Here is an org that I created a few years ago: indiehackers.com/product/food-static-data

As usual, experienced developers might tell me that I'm stupid and that CSV parsing is a mundane procedure. But I don't care. I realized that we were running similar code in a few separate projects, so I decided to isolate it.

I did it a few times before I finally found a way to make it work the way I like. You can see how it looks right now.

I can't say it's ideal, but it has been working fine for me. Right now I plan to revamp this package a little so that it works with the latest versions of Rollup and Babel.

The idea is simple: connect a dataset in CSV format, parse it, and export the data in the form you need. But when you have to make it work with 10 different datasets, things aren't as easy as they sound in your head.
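
To make the "connect, parse, export" idea a bit more concrete, here is a minimal sketch. It is not the package's actual code: `parseCsv` and the `datasets` map are hypothetical names, the sample rows are made up, and the naive comma splitting is only a stand-in for what csv-parser does properly (it does not handle quoted fields). The point is just that each dataset may need its own settings, such as a different separator.

```javascript
// Hypothetical helper, not the real package API.
// Naive stand-in for csv-parser: no support for quoted fields or escaped separators.
function parseCsv(text, separator = ',') {
  // Split into lines and drop empty ones (for example, the trailing newline).
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== '');
  if (lines.length === 0) return [];

  // The first line is treated as the header row.
  const headers = lines[0].split(separator).map((header) => header.trim());

  // Every other line becomes an object keyed by header name.
  return lines.slice(1).map((line) => {
    const cells = line.split(separator);
    const row = {};
    headers.forEach((header, index) => {
      row[header] = (cells[index] || '').trim();
    });
    return row;
  });
}

// Made-up sample data: two "datasets" with different separators.
const datasets = {
  fruits: { text: 'id,name\n1,Apple\n2,Banana\n', separator: ',' },
  nutrients: { text: 'id;name\n1;Protein\n', separator: ';' },
};

// Parse every dataset with its own settings and collect the results by name.
const parsed = {};
for (const [name, { text, separator }] of Object.entries(datasets)) {
  parsed[name] = parseCsv(text, separator);
}

console.log(parsed);
```

With two datasets this looks trivial. With ten, each with its own headers, separators, and quirks, the per-dataset configuration is where most of the work goes.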

CSVs aren't only related to food tech datasets. But for me it was important to be able to use different datasets and play with them easily. It makes the other modules that we are building data-agnostic and more independent of a particular database, framework, or logic. Basically, we created and optimized around 13 repositories around this idea. Recently I created a separate organization that will focus on those repositories only.

Link: github.com/Food-Static-Data

Later I plan to remove some repositories once they can be replaced by other, more popular and stable tools. This module can be useful for parsing other datasets too, but decoupling it from the food tech topic isn't my task at this point.

I was also able to include cool and important packages like husky and coveralls. I can't say that I get the most out of them, but they helped me jump into the "open source ocean" and the GitHub rabbit hole that I have been exploring for so many years.

It's also good to not just type another line of code, but to be able to see that your codebase is solid and that nothing is breaking it behind your back.

CodeClimate (codeclimate.com) helped me take another look at how I develop things.

Yeah, CodeClimate shows that I have code duplicates and ~50h of tech debt. Looks like a lot, right? But this is a small independent package. Imagine how much tech debt your huge monolith project has. Years of wasted developer time, probably 🙂

At some point I'll remove the duplicates, and that will reduce the number of hours on this page.

Plus, your product owner or CTO is usually busy and can't review code or track what is going on inside it. CodeClimate can do some of that for you. Just check those settings. They also support the open-source movement, so if your code is open and hosted on GitHub, you can use it for free.

Stretch goals are simple:

  • I want to invest some time into CI/CD stuff. For now I'll pick Travis CI. At some point I'll extend it, so that when a new version of this package is published, it runs against our datasets and I can see whether something breaks.
  • I also need to remove the duplicated code that was moved into separate packages but is still present here for backward compatibility.
  • It's also not cool to see JS code and all these CSV files in the same repository. I need to come up with an idea for how to organize the folders and make them easy to navigate. While it works for me, other people might find it very confusing.

We even wrote a great README file that explains how to run this package.

We collected a great number of datasets that can help a vast number of food projects. Some of them even sell the latest updates for money. So far this module has been tested with:

  • FoodComposition dataset
  • USDA dataset (I picked 4 major tables with data)
  • FAO (Food and Agriculture Organization of the United Nations) dataset

This module is not just for parsing data; we also need to write files in JSON format with the formatted data inside.
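
The JSON-writing part can be sketched roughly like this. Again, this is an assumption about the general shape, not the module's real code: `writeJson` is a hypothetical helper, the sample row is made up, and the temp folder is used only so the example can run anywhere. It relies on Node's built-in `fs`, `os`, and `path` modules.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: writes rows as pretty-printed JSON and returns the file path.
function writeJson(dir, name, rows) {
  // Make sure the output folder exists before writing into it.
  fs.mkdirSync(dir, { recursive: true });
  const target = path.join(dir, `${name}.json`);
  fs.writeFileSync(target, JSON.stringify(rows, null, 2) + '\n', 'utf8');
  return target;
}

// Made-up sample row; in practice this would be the parsed and formatted CSV data.
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), 'food-dataset-'));
const written = writeJson(outDir, 'nutrients', [{ id: '1', name: 'Protein' }]);

// Read the file back to check that the data survived the round trip.
const roundTrip = JSON.parse(fs.readFileSync(written, 'utf8'));
console.log(written, roundTrip);
```

Keeping the writer separate from the parser means the same formatted rows can be saved for any dataset, no matter how they were parsed.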

Show some love if you want more articles like this one! Any activity will be appreciated.
