E-Commerce Data Wrangling

Owning an E-Commerce store is a lot like owning a home. Both require a sound construction plan, regular maintenance, and a fundamental understanding of the various underlying components that define them. I often tell clients that just as they should know how to do things around the house like turn off the gas line and clean the gutters, they should also know how to perform basic tasks on their website like writing simple HTML and optimizing images.

This is true for owning an existing E-Commerce site, but what about migrating to a new one?

Most people at one time or another have moved houses. The problem with moving starts the moment you make the commitment to move. You look around at the living room, bedrooms, kitchen, garage, and there is only madness. So it begins. You start filling up boxes labeled “Kitchen”, “Fragile”, and “Mom’s”. This seems like a fine tactic as you wildly tear through rooms of the house putting things in their appropriate boxes. Before you know it, you have dozens of boxes, all labeled differently with barely a semblance of rhyme or reason. Things that are “Mom’s” go in “Mom’s” box, but what about something that is also “Fragile”? Which box does it go in?

Humans tend to have a strong emotional connection with things like “Mom’s Fragile Trinket”. When it comes out of the box in the new house, it is quickly recognized for what it is and placed in the appropriate location in the home. This fact masks the potential confusion caused by the arbitrary naming of each box. So what if something could have been placed in one of several boxes? It wasn’t, and now it’s out of the box, and we know right where it goes.

Now take a moment to consider the strength of your emotional connection to each individual row of data among thousands of other rows.

Let’s look at a very crude example of customer data:

name              country           email
Elmo Cunningham   Romania           [email protected]
Ryan Vazquez      Greenland         [email protected]
Giacomo Wolfe     Solomon Islands   [email protected]
Abel Mcneil       Guatemala         [email protected]
Chaney Bryan      Greenland         [email protected]
Addison Simon     Liechtenstein     [email protected]
Brennan Owens     Taiwan            [email protected]

Do you have the same connection to “Mom’s Fragile Trinket” as you do to Abel Mcneil from Guatemala? Did you even know he was a customer of yours? Likely not, as a single entry from a massive data table is going to look just like every other entry, especially if you’re the poor soul who gets to enter all this data in a spreadsheet.

What is “Data Wrangling”?

The term data wrangling comes from the idea that various data sources sometimes need to be rounded up (wrangled) and standardized. You might have one customer list from your Point of Sale (POS), another from an email campaign, and yet another from a legacy accounting system. Each list was generated for a different (but possibly similar) purpose, so the fields each contains are not likely to match up exactly.

The data wrangler will take these disparate data sources and wrangle them into a single data table according to whatever the requirements may be. Much like someone wrangling a bunch of cattle on a ranch, this is not sexy work and is often something that is postponed or ignored because nobody wants to do data entry.
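As a rough illustration of that round-up, here is a minimal Python sketch. The source labels, field names, and sample records are all made up for the example; a real POS, email, or accounting export will have its own column names, and matching customers on email alone is a simplifying assumption.

```python
# Hypothetical field names: real POS, email, and accounting exports will differ.
FIELD_MAPS = {
    "pos": {"customer_name": "name", "country": "country", "email_address": "email"},
    "email_campaign": {"full_name": "name", "location": "country", "email": "email"},
    "accounting": {"contact": "name", "nation": "country", "contact_email": "email"},
}


def normalize(record, source):
    """Rename one source's fields to the shared name/country/email layout."""
    mapping = FIELD_MAPS[source]
    row = {target: str(record.get(field, "")).strip() for field, target in mapping.items()}
    row["email"] = row["email"].lower()  # treat ABEL@... and abel@... as the same key
    return row


def wrangle(sources):
    """Merge records from every source into one list, one row per email address."""
    merged = {}
    for source, records in sources.items():
        for record in records:
            row = normalize(record, source)
            if not row["email"]:
                continue  # no usable key; in real work, set these aside for review
            existing = merged.setdefault(row["email"], row)
            # Fill any blanks in the row we already hold with values from this source.
            for key, value in row.items():
                if not existing[key]:
                    existing[key] = value
    return list(merged.values())


# Made-up sample records for illustration only.
customers = wrangle({
    "pos": [{"customer_name": "Abel Mcneil", "country": "", "email_address": "ABEL@example.com "}],
    "accounting": [{"contact": "Abel Mcneil", "nation": "Guatemala", "contact_email": "abel@example.com"}],
})
print(customers)
```

The two sample records collapse into a single row because they share an email address, and the blank country from the POS record is filled in from the accounting record. Which source should win when two lists disagree is a business decision this sketch does not attempt to make.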

Returning to the analogy of a website being like a home, thorough data wrangling might be analogous to having a well-organized bill of materials when beginning a new project. As a developer beginning a new project, one of the first things I am going to be curious about is what the client sells. How are these products modeled? When considering my platform of choice, Magento, there is an out-of-the-box standard for how a product is modeled. There is a name, a SKU, a description, a price, inventory, URL, etc. These are maintained for all products. More customized product catalogs will then include additional attributes that are specific to the product. A leather sofa might have a color, a material type, optional ottomans, etc. A television will have a size, descriptions of various technologies, and so on.
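To make the idea of a product model concrete, here is a small Python sketch of the fields named above plus a free-form set of product-specific attributes. This is not Magento's actual schema or API; the class name, attribute keys, and sample values are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Product:
    # Core fields that every product carries, regardless of type.
    name: str
    sku: str
    description: str
    price: Decimal
    inventory: int
    url: str
    # Product-specific extras, such as a sofa's color or a television's size.
    attributes: dict = field(default_factory=dict)


# Made-up sample products for illustration only.
sofa = Product(
    name="Leather Sofa",
    sku="SOFA-001",
    description="Three-seat leather sofa",
    price=Decimal("899.00"),
    inventory=4,
    url="leather-sofa",
    attributes={"color": "brown", "material": "leather", "ottoman": "optional"},
)

tv = Product(
    name="Television",
    sku="TV-055",
    description="55-inch television",
    price=Decimal("499.00"),
    inventory=12,
    url="television-55",
    attributes={"screen_size_inches": 55},
)
```

Keeping the shared fields fixed and pushing everything else into `attributes` mirrors the split described above: every product answers the same basic questions, while the extras vary by product type.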

When we were boxing our house up to move, we didn’t care a whole lot about deciding where to put “Mom’s Fragile Trinket”. Does it go in “Mom’s” box or in the “Fragile” box? Not a big deal, because of that emotional connection. When we go to import a product catalog to a new site, however, we will absolutely care about getting every data entry correct and assigned to the correct “box”. The data wrangler likely has zero emotional connection to the store’s products. They might not even know what is sold or who the client is.
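Because the wrangler cannot rely on recognizing the products, simple automated checks can stand in for that missing familiarity. The sketch below flags rows with empty required fields or repeated SKUs before an import is attempted. The list of required fields and the sample rows are assumptions for the example, not a rule taken from any particular platform.

```python
# Assumed minimum set of fields a row needs before import; adjust to the target site.
REQUIRED_FIELDS = ("name", "sku", "price")


def find_problems(rows):
    """Return (row_number, message) pairs for rows that should not be imported as-is."""
    problems = []
    seen_skus = set()
    for row_number, row in enumerate(rows, start=1):
        # Flag any required field that is missing or blank.
        for field_name in REQUIRED_FIELDS:
            if not str(row.get(field_name, "")).strip():
                problems.append((row_number, f"missing {field_name}"))
        # Flag a SKU that already appeared on an earlier row.
        sku = str(row.get("sku", "")).strip()
        if sku:
            if sku in seen_skus:
                problems.append((row_number, f"duplicate sku {sku}"))
            seen_skus.add(sku)
    return problems


# Made-up sample rows for illustration only.
catalog_rows = [
    {"name": "Leather Sofa", "sku": "SOFA-001", "price": "899.00"},
    {"name": "", "sku": "SOFA-001", "price": "10.00"},
]
for row_number, message in find_problems(catalog_rows):
    print(f"row {row_number}: {message}")
```

A report like this does not decide which “box” an item belongs in, but it does surface the rows where a person needs to make that call.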

What Does It All Mean?

Ultimately, it pays off to have a sound plan for migrating data before you even know what you will be migrating the data to. Here we are talking about an E-Commerce site, but really, the applications for your data are endless.

  • Synchronized customer lists between POS, E-Commerce, CRM systems
  • Product search index usable by any compatible application
  • Passing order data from an E-Commerce site to a dropship vendor
  • Integrating a new product line into an existing sales channel

At Fruition, we recognize the complexities and intricacies of all types of businesses and their custom data models. Through a detailed look at the data you’re storing, we can help you build not only a more efficient E-Commerce store, but also a more efficient business.

Post by Preston Spahn, Magento Developer

