To make matters worse, many high-value Big Data analytical problems are (literally) not meaningfully visualizable except for marketing purposes. It is rather tricky to visualize an analytic product when there are a hundred critical values that need to be rendered in some fashion for every pixel your monitor can display. A lot of high-value analytics have this characteristic but most of the nominal Big Data visualization tools ignore this case even though it is arguably the most important one.
Consequently, while labeling your startup "Big Data" is trendy and fashionable, there are very few genuine Big Data startups. Adding value in this market requires a combination of serious theoretical computer science chops plus very creative interface design. Few startups are actually addressing the needs of this market; most instead assume the market wants the web app they have the skills to produce.
Stunning visualizations and a better web experience are definitely something we want to do when timing allows, but so far our true customer value is in the backend and our APIs.
We got completely blindsided by the idea of a "big data product" and quite naively replaced "product" with "trendy webapp with shiny charts" in our minds. There is definitely a place for a technological product addressing those needs, but those are deep tech products paired with services, because they require setup, maintenance, and so on. For instance, our friends at infochimps do it really, really well, and they're on a good path to a big data product.
For example, one thing I've been forced to implement myself is a lot of string manipulation operations to sanitize the different kinds of data I'm playing with in spreadsheets.
Even just having something misimported wastes a lot of time.
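To make the kind of cleanup I mean concrete, here is a minimal sketch of a sanitizing pass over cells that were imported as text. The helper names `sanitizeCell` and `parseNumberCell` are made up for illustration, and the rules (strip invisible characters, collapse whitespace, drop `$` and thousands separators) are assumptions about typical spreadsheet debris, not a complete solution.

```javascript
// Hypothetical helper: normalize one spreadsheet cell that arrived as text.
function sanitizeCell(raw) {
  if (raw === null || raw === undefined) return '';
  return String(raw)
    .replace(/[\u200b\ufeff]/g, '') // drop zero-width spaces and byte-order marks
    .replace(/\s+/g, ' ')           // collapse whitespace runs (incl. non-breaking spaces)
    .trim();
}

// Hypothetical helper: return a number only when the cleaned cell is plainly numeric.
// Assumes "," is a thousands separator and "$" is the only currency symbol.
function parseNumberCell(raw) {
  const cleaned = sanitizeCell(raw).replace(/[$,]/g, '');
  return /^-?\d+(\.\d+)?$/.test(cleaned) ? Number(cleaned) : null;
}
```

Returning `null` for anything that is not plainly numeric keeps a misimported column visible instead of silently turning it into zeros.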
OpenRefine isn't bad, but can only get you so far. That being said, if I can come up with a complete solution myself, I wouldn't mind just adding it to the suite of tools I'm already offering :)
I'm also wondering about different kinds of tools already out there though.
(Real question. I am developing this business as we speak).
This is an example of an industry vertical analytics solution. Our hypothesis is that analytics products will be created to support all kinds of use cases like this (insurance analytics, manufacturing analytics, distribution analytics, etc). We want Keen IO to be the platform on which people build those solutions.
The way we can help you is to first make it easy for you to reliably collect data from all your dentists (with client libraries). Then we expose all that data, and all of our analytics capabilities (e.g. queries), by REST API. You can log into Keen IO and create a line chart, then copy and paste the JavaScript right into your site. Now you can build a completely white-labeled website for your dentists, while we take care of storing and querying your event data. Our scoped API keys will allow you to secure the data so a given dentist can only see data for her offices. You can also create an internal dashboard for your team so you'll be able to do analysis across all the dentists. Hopefully you'll discover industry trends and benchmarks that you can use in marketing reports or to resell to the dental industry (assuming your dentists allow this!). I bet they would like to know how they compare to other dentists...
Anyway, this is too much fun. Let me know if you want to brainstorm more!
But mainly I want to say I'm super impressed with the Google Analytics 'storification'. I can imagine the difficulties in bringing that level of quality to myriad data sources, but I'm excited to see you succeed!
Just verify the expected variables/methods exist prior to using them (lines 99, 197, 365 right now):
if (typeof hbspt !== 'undefined') hbspt.cta.load(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ added
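If the same check is needed in several places, it could be wrapped in a small helper. This is only a sketch: `callIfAvailable` is a hypothetical name, not part of any library. It looks the name up as a property of a root object rather than referencing the global directly, because referencing an undeclared global throws a ReferenceError.

```javascript
// Hypothetical helper: call a nested method only if every step of the path exists.
// Returns true if the method was called, false if anything along the path was missing.
function callIfAvailable(root, path, args) {
  const keys = path.split('.');
  const method = keys.pop();
  let owner = root;
  for (const key of keys) {
    if (owner == null) return false; // an earlier step was missing
    owner = owner[key];
  }
  if (owner == null || typeof owner[method] !== 'function') return false;
  owner[method](...args); // called on its owner so `this` is preserved
  return true;
}
```

In a page this might be used as `callIfAvailable(window, 'hbspt.cta.load', args)`, where `args` is an array holding whatever arguments the existing call already passes.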