This experience, along with observing similar open-source projects in other domains, has made me wonder about the broader impact on SaaS.
I'm curious to hear the community's thoughts on whether open-source alternatives are genuinely challenging SaaS dominance, and if so, in what ways?
What trends or examples have you noticed in your field?
Multiwoven, our Open Source alternative to Hightouch, Census and RudderStack, has always been about making data available where it's needed.
We've added a new AWS S3 connector as a data source to Multiwoven. This connector has been a highly requested feature from the community.
Beyond adding AWS S3 as a data source, we believe we've also optimized the performance of querying data stored in S3 buckets.
We've integrated DuckDB, an in-memory analytical database, to provide fast and efficient SQL query execution on large datasets directly in S3.
-> Features:
1. IAM and Role-based Access - Securely connect to AWS S3 buckets using IAM or role-based permissions.
2. File Format Support - Native support for CSV and Parquet file formats.
3. DuckDB Powered Performance - Uses DuckDB, an in-memory analytical database, for fast and efficient SQL query execution on large datasets directly in S3.
4. Native SQL Interface - Execute SQL queries directly on data stored in S3 buckets, eliminating the need for intermediate scripting steps or data movement to a separate database.
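To make the "SQL directly on S3" idea concrete, here is a minimal sketch of how a query over a CSV or Parquet object could be phrased for DuckDB. `read_csv_auto` and `read_parquet` are DuckDB table functions; the helper name `s3_select`, the bucket, and the object key are hypothetical, and this is not the connector's actual code. The sketch only builds the SQL string. Running it would additionally need the `duckdb` package, its `httpfs` extension, and AWS credentials.

```python
from pathlib import PurePosixPath

# Maps a file extension to the DuckDB table function that reads it.
# The function names are DuckDB's; everything else here is illustrative.
READERS = {".csv": "read_csv_auto", ".parquet": "read_parquet"}


def s3_select(bucket: str, key: str, columns: str = "*", where: str = "") -> str:
    """Build a DuckDB SQL statement that scans one S3 object in place."""
    suffix = PurePosixPath(key).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"unsupported file format: {suffix or key}")
    # Escape single quotes so the URI is a valid SQL string literal.
    uri = f"s3://{bucket}/{key}".replace("'", "''")
    sql = f"SELECT {columns} FROM {READERS[suffix]}('{uri}')"
    return f"{sql} WHERE {where}" if where else sql


# Hypothetical bucket and key, e.g. ML batch scores written as Parquet.
query = s3_select("example-bucket", "scores/2024.parquet", "user_id, score", "score > 0.8")
print(query)
```

The point of the sketch is that the file in S3 is addressed like a table, so no intermediate load step into a separate database is needed before filtering or transforming.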
-> Use Cases:
* Query and Transform - Convert ML model batch results stored in S3 buckets into actionable insights.
* Sync Data - Sync log data or event streams from S3 to business applications like Salesforce, Google Sheets, or other destinations for real-time analytics.
[ https://github.com/Multiwoven/multiwoven ]
Refer to our GitHub repository for more information & hit the star button to show your support :)
It's been a great journey so far, and we are excited to announce a major update to Multiwoven - our new release, Multiwoven 0.2.0, is now available!
Repo: https://github.com/Multiwoven/multiwoven
This release brings a host of new features, enhancements, and bug fixes to streamline data syncs and user experience.
From new connectors to advanced reporting dashboards, our team has been working hard on these updates, based on feedback and requests from our customers and the community.
- 10+ new connectors added to Multiwoven, including Databricks Warehouse, Postgres, HubSpot, Google Sheets, Airtable, Salesforce Consumer Goods, Stripe, SFTP, and more!
- Introducing the Multiwoven Reporting Dashboard, Templating Dashboard, and Sync Listing Page for better visibility and management of your data sync operations.
- We have also addressed several stability issues, introduced rate limiting, and added batch support to improve the performance of data sync operations.
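For readers wondering what batch support and rate limiting look like in a sync loop, here is a rough sketch under stated assumptions. The names `batched`, `sync_in_batches`, and `send` are hypothetical, and this is not Multiwoven's implementation: it only shows records being grouped into fixed-size batches, with calls to the destination spaced so they stay under a maximum call rate.

```python
import time
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while chunk := list(islice(it, size)):
        yield chunk


def sync_in_batches(
    records: Iterable[dict],
    send: Callable[[List[dict]], None],
    batch_size: int = 100,
    max_calls_per_sec: float = 5.0,
) -> int:
    """Send records in batches, pausing between calls to respect the rate limit."""
    min_interval = 1.0 / max_calls_per_sec
    last_call = None
    sent = 0
    for batch in batched(records, batch_size):
        if last_call is not None:
            # Sleep only for the part of the interval that has not elapsed yet.
            wait = min_interval - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        last_call = time.monotonic()
        send(batch)
        sent += len(batch)
    return sent


# Stand-in destination: collect batches in a list instead of calling an API.
calls: List[List[dict]] = []
total = sync_in_batches(({"id": i} for i in range(250)), calls.append,
                        batch_size=100, max_calls_per_sec=50)
print(total, [len(b) for b in calls])  # 250 [100, 100, 50]
```

Batching cuts the number of destination API calls, and the spacing keeps a large sync from tripping the destination's own rate limits.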
We are excited to see how the community uses these new features and connectors to build powerful data sync workflows.
Repo: https://github.com/Multiwoven/multiwoven
Give us a star if you like what we are building!
We wanted something durable that could handle retries. Our initial thought was to use Sidekiq with Rails, but its lack of durability and the need to handle retries ourselves made us look for other options. We also didn't want to create a dependency on Redis.
When we first did a POC with Temporal IO, we were amazed by the performance of running long tasks using Postgres as the data store. We benchmarked it by running a workload that took 3-4 minutes to complete and processed 100K records from a data warehouse. Of course, we had to make some changes to our code to make it work with Temporal IO, but it was worth it.
It also provided a nice-looking UI to monitor the syncs and the ability to retry failed tasks, which was a big plus for us.
If anyone is looking to build long-running workloads, I would highly recommend Temporal IO. It just works.
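To illustrate the retry problem described above, here is a minimal hand-rolled sketch of retrying a task with exponential backoff. It deliberately does not use Temporal's SDK, and `run_with_retries` and `flaky_sync` are hypothetical names. It only shows the kind of logic you end up writing yourself when the job runner does not provide it.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_attempts: int = 3,
    initial_delay: float = 0.01,
    backoff: float = 2.0,
) -> T:
    """Call `task` until it succeeds, waiting longer after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
            delay *= backoff


# Simulated sync step that fails twice before succeeding.
attempts = {"count": 0}


def flaky_sync() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("warehouse temporarily unavailable")
    return "synced"


result = run_with_retries(flaky_sync)
print(result, attempts["count"])  # synced 3
```

Note what this loop cannot do: the attempt counter lives in process memory, so a crash or deploy mid-sync loses all progress. That gap is the durability a workflow engine backed by a persistent store is meant to close.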
Link to our GitHub repo: https://github.com/Multiwoven/multiwoven
Data movement has become a critical part of the modern data stack. More and more companies are evolving within the data movement space. The entire data movement landscape can be broken down into the following product categories:
- Data Storage:
1. Data warehouse: Snowflake, BigQuery, Redshift, Databricks, and a few others.
- Data Management:
1. ETL: Fivetran, Stitch, Airbyte, and a few others.
- Data Activation:
1. Reverse ETL: Hightouch, Census, RudderStack, and a few others.
2. CDP: Salesforce, Segment, mParticle, and a few others.
While data storage and data management have been around for a while, the data activation space, especially warehouse-native reverse ETL and CDP, is relatively new.
Today, more and more teams are aggregating customer data from multiple sources into a data warehouse or data lake. Large data warehouse platforms like Snowflake, BigQuery, Databricks, and Redshift have grown significantly in the past couple of years.
Aggregating data into a data warehouse is only the first step. Business functions like marketing, sales, and customer support need access to this data, which often sits in the back office and is not easily accessible.
Multiwoven is an Open Source CDP & Reverse ETL for your data warehouse. It is lightweight because you can cherry-pick the features you need and run it for specific use cases.
For example, if you need to sync data from your data warehouse to your CRM, you can use the reverse ETL feature.
If you need customer360 and audience segmentation, you can use the CDP feature.
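As a rough illustration of the warehouse-to-CRM case, the core of a reverse ETL sync is a mapping from warehouse columns to destination fields. The sketch below assumes hypothetical column names, CRM field names, and a helper called `to_crm_payloads`. It is not Multiwoven's API, only the general shape of the transformation.

```python
from typing import Dict, Iterable, List, Mapping

# Hypothetical mapping: warehouse column name -> CRM field name.
FIELD_MAP = {"email": "Email", "full_name": "Name", "ltv_usd": "Lifetime_Value__c"}


def to_crm_payloads(
    rows: Iterable[Mapping[str, object]],
    field_map: Mapping[str, str],
) -> List[Dict[str, object]]:
    """Rename mapped columns to CRM fields, dropping unmapped and null values."""
    payloads = []
    for row in rows:
        payload = {
            dest: row[src]
            for src, dest in field_map.items()
            if row.get(src) is not None
        }
        if payload:
            payloads.append(payload)
    return payloads


# Rows as they might come back from a warehouse query (made-up data).
rows = [
    {"email": "a@example.com", "full_name": "Ada", "ltv_usd": 120.5, "internal_id": 7},
    {"email": "b@example.com", "full_name": None, "ltv_usd": 0},
]
payloads = to_crm_payloads(rows, FIELD_MAP)
print(payloads)
```

Columns that are not in the mapping (here `internal_id`) never leave the warehouse, and null values are skipped so they do not overwrite fields that already exist in the CRM.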
Open for early feedback and contributions. Repo: [https://github.com/Multiwoven/multiwoven]