This experience, along with observing similar open-source projects in other domains, has made me wonder about the broader impact on SaaS.
I'm curious to hear the community's thoughts on whether open-source alternatives are genuinely challenging SaaS dominance, and if so, in what ways?
What trends or examples have you noticed in your field?
Multiwoven, our Open Source alternative to Hightouch, Census and RudderStack, has always been about making data available where it's needed.
We've added a new AWS S3 connector as a data source to Multiwoven. This data source connector has been a highly requested feature from the community.
Beyond adding AWS S3 as a data source, we believe we've also optimized the performance of querying data stored in S3 buckets.
We've integrated DuckDB, an in-memory analytical database, to provide fast and efficient SQL query execution on large datasets directly in S3.
-> Features:
1. IAM and Role-based Access - Securely connect to AWS S3 buckets using IAM or role-based permissions.
2. File Format Support - Native support for CSV and Parquet file formats.
3. DuckDB Powered Performance - Utilizes DuckDB, an in-memory analytical database, for fast and efficient SQL query execution on large datasets directly in S3.
4. Native SQL Interface - Execute SQL queries directly on data stored in S3 buckets, eliminating the need for intermediate scripting steps or data movement to a separate database.
-> Use Cases:
* Query and Transform - Convert ML model batch results stored in S3 buckets into actionable insights.
* Sync Data - Sync log data or event streams from S3 to business applications like Salesforce, Google Sheets, or other destinations for real-time analytics.
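To make the "SQL directly on S3" idea concrete, here is a minimal sketch of the kind of statements DuckDB accepts for scanning CSV or Parquet files in a bucket. This is not Multiwoven's connector code: the helper names, bucket, and key pattern are hypothetical, and the sketch only builds SQL strings so that it runs without DuckDB or AWS credentials.

```python
from typing import List

# DuckDB table functions for the two formats mentioned above.
READERS = {"parquet": "read_parquet", "csv": "read_csv_auto"}


def build_s3_setup(region: str) -> List[str]:
    """Statements that enable DuckDB's S3 support (httpfs extension)."""
    safe_region = region.replace("'", "''")
    return [
        "INSTALL httpfs;",
        "LOAD httpfs;",
        f"SET s3_region='{safe_region}';",
    ]


def build_s3_query(bucket: str, key_glob: str, file_format: str, limit: int = 100) -> str:
    """Build a SELECT that scans files under s3://bucket/key_glob.

    bucket and key_glob are placeholders supplied by the caller.
    """
    fmt = file_format.lower()
    if fmt not in READERS:
        raise ValueError(f"unsupported format: {file_format}")
    # Double any single quotes so the path stays a valid SQL string literal.
    path = f"s3://{bucket}/{key_glob}".replace("'", "''")
    return f"SELECT * FROM {READERS[fmt]}('{path}') LIMIT {int(limit)}"


if __name__ == "__main__":
    for statement in build_s3_setup("us-east-1"):
        print(statement)
    print(build_s3_query("my-bucket", "exports/*.parquet", "parquet", limit=10))
```

With the `duckdb` Python package installed and credentials configured, these strings could be passed to `duckdb.connect().execute(...)`. How credentials are supplied (IAM user keys or an assumed role) is left out of the sketch.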
[ https://github.com/Multiwoven/multiwoven ]
Refer to our GitHub repository for more information & hit the star button to show your support :)
It's been a great journey so far, and we are excited to announce a major update to Multiwoven - our new release, Multiwoven 0.2.0, is now available!
Repo: https://github.com/Multiwoven/multiwoven
This release brings a host of new features, enhancements, and bug fixes to streamline data syncs and user experience.
From new connectors to advanced reporting dashboards, as a team, we have been working hard on these updates based on the feedback and requests from our customers and the community.
- 10+ new connectors added to Multiwoven, including Databricks Warehouse, Postgres, HubSpot, Google Sheets, Airtable, Salesforce Consumer Goods, Stripe, SFTP and more!
- Introducing the Multiwoven Reporting Dashboard, Templating Dashboard, and Sync Listing Page for better visibility and management of your data sync operations.
- We have also addressed several stability issues, introduced rate limiting, and added batch support to improve the performance of data sync operations.
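As a rough illustration of how batching and rate limiting fit together in a sync, here is a small sketch. It is not Multiwoven's implementation: `sync_in_batches`, `send_batch`, and the default limits are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Iterable, Iterator, List


def batched(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield lists of at most batch_size records."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def sync_in_batches(
    records: Iterable[dict],
    send_batch: Callable[[List[dict]], None],
    batch_size: int = 100,
    max_batches_per_second: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send records in batches, spacing calls to respect a simple rate limit."""
    min_interval = 1.0 / max_batches_per_second
    last_sent = None
    total = 0
    for batch in batched(records, batch_size):
        if last_sent is not None:
            # Wait out the remainder of the interval since the previous call.
            wait = min_interval - (time.monotonic() - last_sent)
            if wait > 0:
                sleep(wait)
        last_sent = time.monotonic()
        send_batch(batch)  # hypothetical call to a destination's bulk API
        total += len(batch)
    return total
```

Passing `sleep` as a parameter keeps the pacing logic testable without real delays.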
We are excited to see how the community uses these new features and connectors to build powerful data sync workflows.
Repo: https://github.com/Multiwoven/multiwoven
Give us a star if you like what we are building!
We wanted something durable that could handle retries. Our initial thought was to use Sidekiq with Rails, but its lack of durability and the need to handle retries ourselves made us look for other options. We also didn't want to create a dependency on Redis.
When we first did a POC with Temporal IO, we were amazed by the performance of running long tasks using Postgres as the data store. We benchmarked it by running a workload that took 3-4 minutes to complete and processed 100K records from a data warehouse. Of course, we had to make some changes to our code to make it work with Temporal IO, but it was worth it.
It also provided a nice-looking UI to monitor the syncs and the ability to retry failed tasks, which was a big plus for us.
If anyone is looking to build long-running workloads, I would highly recommend Temporal IO. It just works.
Link to our GitHub Repo: https://github.com/Multiwoven/multiwoven
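For readers unfamiliar with retry policies, here is a minimal plain-Python sketch of retrying a task with exponential backoff. It does not use the Temporal SDK and it is not durable (state is lost if the process dies); the function names are hypothetical and the parameter names only mirror common retry-policy concepts.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_schedule(
    initial_interval: float,
    backoff_coefficient: float,
    maximum_interval: float,
    maximum_attempts: int,
) -> List[float]:
    """Delays (in seconds) to wait between consecutive attempts."""
    return [
        min(initial_interval * backoff_coefficient ** i, maximum_interval)
        for i in range(maximum_attempts - 1)
    ]


def run_with_retries(
    task: Callable[[], T],
    initial_interval: float = 1.0,
    backoff_coefficient: float = 2.0,
    maximum_interval: float = 60.0,
    maximum_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on any exception until attempts are exhausted."""
    delays = backoff_schedule(
        initial_interval, backoff_coefficient, maximum_interval, maximum_attempts
    )
    for attempt in range(maximum_attempts):
        try:
            return task()
        except Exception:
            if attempt == maximum_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delays[attempt])
    raise RuntimeError("maximum_attempts must be at least 1")
```

A workflow engine adds what this sketch lacks: persisting progress so a retry can resume after a crash or restart.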
Data movement has become a critical part of the modern data stack. More and more companies are evolving within the data movement space. The entire data movement landscape can be broken down into the following product categories:
- Data storage:
1. Data warehouse: Snowflake, BigQuery, Redshift, Databricks, and a few others.
- Data Management:
1. ETL: Fivetran, Stitch, Airbyte, and a few others.
- Data Activation:
1. Reverse ETL: Hightouch, Census, RudderStack, and a few others.
2. CDP: Salesforce, Segment, mParticle, and a few others.
While data storage and data management have been around for a while, the data activation space, especially warehouse-native reverse ETL and CDP, is relatively new.
Today, more and more teams are aggregating customer data from multiple sources into a data warehouse or data lake. Large data warehouse platforms like Snowflake, BigQuery, Databricks, and Redshift have grown significantly in the past couple of years.
Aggregating data into a data warehouse is only the first step. Business functions like marketing, sales, and customer support need access to this data, which often sits in the back office and is not easily accessible.
Multiwoven is an Open Source CDP & Reverse ETL for your data warehouse. It is lightweight because you can cherry-pick the features you need and run it for specific use cases.
Ex: If you need to sync data from your data warehouse to your CRM, you can use the reverse ETL feature.
If you need Customer 360 and audience segmentation, you can use the CDP feature.
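To show what a reverse ETL sync does at its core, here is a small sketch that queries rows from a warehouse and maps columns to destination fields. It is an illustration only, not Multiwoven's code: an in-memory SQLite database stands in for the warehouse, and the table, the `extract_and_map` helper, and CRM field names such as `Lifetime_Value__c` are hypothetical.

```python
import sqlite3
from typing import Dict, List


def extract_and_map(
    conn: sqlite3.Connection, query: str, field_mapping: Dict[str, str]
) -> List[Dict[str, object]]:
    """Run query and rename source columns to destination field names."""
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    payloads = []
    for row in cursor:
        record = dict(zip(columns, row))
        # Keep only mapped columns, keyed by the destination field name.
        payloads.append({dest: record[src] for src, dest in field_mapping.items()})
    return payloads


def demo() -> List[Dict[str, object]]:
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, ltv REAL)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "a@example.com", 120.0), (2, "b@example.com", 45.5)],
    )
    return extract_and_map(
        conn,
        "SELECT id, email, ltv FROM customers ORDER BY id",
        {"email": "Email", "ltv": "Lifetime_Value__c"},
    )


if __name__ == "__main__":
    for payload in demo():
        print(payload)  # each payload would be sent to the CRM's API
```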
Open for early feedback and contributions. Repo: [https://github.com/Multiwoven/multiwoven]
We have a background in product and engineering for Asia population-scale customer data infrastructure and applications - across Affle (3 Bn+ connected devices for consumer intelligence-based marketing), Truecaller (~400 Mn. MAU) and Razorpay (India's largest payments platform).
We built Multiwoven as an open source reverse ETL tool for data activation. After our recent repo launch, Multiwoven started being used by data teams to eliminate the complexity and tediousness of building and maintaining data pipelines to third-party biz/marketing/sales tools.
From our early users and discovery conversations, we quickly started seeing users who have a data warehouse use us like a CDP on top of their warehouse. Many of them were customers and prospects of Salesforce (CRM and/or marketing cloud).
The simple solution that worked for them -
1. Easily works with all their tools (not just Salesforce's ecosystem). We are adding roughly two new Connectors every week, and it's also easy to build or customize for someone's own needs (without going through long and expensive professional services)
2. Not being compelled to invest upwards of half a million dollars a year on Salesforce CDP. The Salesforce CDP Starter Pack is at $110K, and doesn't include Segmentation, Audiences and Ad integrations (imagine a CDP without these!)
3. Being able to easily self-host the software (that essentially has access to all their warehouse data) and not worry about jumping through Compliance/InfoSec's hoops.
Gartner recently naming Salesforce CDP a leader in their first Magic Quadrant for CDPs is certainly surprising. We're not sure that users feel the same way. We recently released a connector to Salesforce CRM, and Salesforce Marketing Cloud is on the way. We are continuing to build furiously and work to meet users' needs.
As a young project, we'd love your feedback. https://github.com/Multiwoven/multiwoven
Today, most companies are unifying their customer data in a data warehouse. Data warehouses like Snowflake, BigQuery, Redshift, and others are becoming the single source of truth for companies and growing rapidly.
Naturally, companies want to make use of this data, and a CDP implementation becomes a natural choice.
But what companies fail to understand is that data control and ownership are critical aspects of a CDP. Companies tend to give CDP vendors access to their data warehouse, which could lead to data privacy and security issues.
Today, open-source is growing rapidly, and companies are looking for open-source alternatives to every solution. Self-hosting and retaining control of the data is the preferred choice for companies.
That's exactly why we are leading the charge with the first open-source, warehouse-native CDP.
Repo Link - https://github.com/Multiwoven/multiwoven
We are happy to see that Ruby is still the language of choice for many developers and companies. We can't wait to see what Ubicloud will bring to the Ruby community.
Kudos to the massive dream!
https://techcrunch.com/2024/03/05/ubicloud-wants-to-build-an-open-source-alternative-to-aws/