Here's the short pitch.
The Securities and Exchange Commission (SEC) keeps records of every company in the United States. Companies whose holdings surpass $100 million, though, are required to file a special type of form: the 13F. This form, filed quarterly, discloses the filer's holdings, providing transparency into their investment activities and allowing the public and other market participants to monitor them.
The problem, though, is that these holdings are often cumbersome to access, and valuable analysis is often hidden behind a paywall. Through wallstreetlocal, the SEC's 13F filers become more accessible and open.
By exploring the website (and the code), you can see the resources I used, check out some notable money managers I listed, and download any data that suits you. All for free. (Note, the mobile site likely needs work.)
I made this project to better democratize SEC filings, and also to get some hands-on experience. I love computers, and one day hope to get involved with startups. In the comments, I'd appreciate any and all advice, as well as feedback on how to improve the site.
Great project though! Opening up these sorts of semi-encumbered datasets is what keeps humans well informed.
I'm from MI as well and always wondered about deeper datasets for watching money and influence change hands
I agree with what you're saying though, it would be cool to take something like those property owner layers and find the ultimate legal entity for stuff like LLCs owning land.
Edit, only semi-related: In the 'urban design' part of the internet I've seen really cool mapping that puts bar charts on top of a city's grid, with the bars showing how much tax revenue each part of the city generates. It's really stark to see skyscraper-sized bars in downtown cores and mostly flat all around where cities have zoned residential separate from commercial, or even where suburbs tax less than the core city.
Gaia GPS [1] is the one I use. It's got a lot of layers for free including land ownership which properly shows all of my neighbors plots. I use it often when collecting rock specimens and mushrooms to make sure I'm on public land that allows it or to figure out who to seek permission from.
>> Every company in the United States
Sorry for being so fussy, but I highly recommend changing the word 'company' and not using it in the future, as the title is quite misleading. No private company in the US has to register with the SEC or file with the SEC. 'Investment advisors', who also go by other names like 'asset manager', have to file a 13F only if they a) are registered with the SEC for fund marketing purposes and b) have, as you already mentioned, over $100 million under management (not 'in holdings'). This is also why large family offices (e.g. Bayshore Global Management of Sergey Brin) won't show up in any SEC records, as they meet the second but not the first criterion. The same goes for pretty much any non-asset-management company (e.g. McDonald's), as they do not raise money for fund vehicles. However, you have taken this into account on your website, and further below you wrote "money manager", which is correct in finance jargon.
I hope this gives you a better understanding, keep up the great work.
The reason for this 13F filing, as you already guessed to some extent, is that Nvidia is a publicly traded company. As such it is subject to a wide range of SEC filings including those from section 13[2].
Nvidia seems to be a rare case. Acquiring public equity as a company, especially as a public one, just for the purpose of managing concurrent assets is very unusual but not out of the question; it is just away from the textbook. Given that it's ARM, whose acquisition failed before Nvidia filed the 13F, it could also serve some other purpose, e.g. showing that interest is still present.
[1] https://www.sec.gov/divisions/investment/13ffaq#:~:text=Bank....
[2] https://www.legalandcompliance.com/securities-law/sec-report...
Thanks for informing me, and for the kind words, I'll make sure to avoid using the word company from now on.
Anyhow, are you downloading the SEC filings locally and processing them? It can be a lot of files! The EDGAR database has a huge number of them. I download stuff daily, add it to SQLite, and then process it into various other things. I had to do some app-side compression as the SQLite file gets big!
Other than the search database which needs to be available on demand, the rest of the filers are queried and analyzed on demand. This is required because some filers get really, really big. Blackrock Inc alone is about a 30MB file.
Right now I scan the list of daily filings and mark every CIK in it as dirty. Another pass looks at the CIKs marked dirty and downloads the XML or JSON of their filings. I then scan for anything new, create filing rows, and mark those as dirty too, so to speak.
Annual reports can be big, and for my purposes I don't need them so I skip them. The HTML of these things is garbage! But a lot of financial data is in the company facts which is nice and clean.
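For anyone curious what that two-pass "dirty flag" approach might look like, here is a minimal sketch using Python's built-in sqlite3. The table layout, the function names, and the `fetch_filings` callback are all my own assumptions for illustration, not the commenter's actual code; the callback stands in for whatever downloads the filing index for a CIK.

```python
import sqlite3


def open_db(path=":memory:"):
    """Create the two tables used by the sketch (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS filers (
            cik   TEXT PRIMARY KEY,
            dirty INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS filings (
            accession TEXT PRIMARY KEY,
            cik       TEXT NOT NULL,
            dirty     INTEGER NOT NULL DEFAULT 1
        );
    """)
    return db


def mark_dirty(db, ciks):
    """Pass 1: flag every CIK seen in the daily filing list as dirty."""
    for cik in ciks:
        db.execute("INSERT OR IGNORE INTO filers (cik) VALUES (?)", (cik,))
        db.execute("UPDATE filers SET dirty = 1 WHERE cik = ?", (cik,))
    db.commit()


def refresh_dirty(db, fetch_filings):
    """Pass 2: for each dirty CIK, add unseen filings and clear the flag.

    `fetch_filings(cik)` is a hypothetical callback returning accession
    numbers. New filing rows start out dirty so a later pass can process them.
    """
    new_rows = 0
    dirty = [row[0] for row in db.execute("SELECT cik FROM filers WHERE dirty = 1")]
    for cik in dirty:
        for accession in fetch_filings(cik):
            seen = db.execute(
                "SELECT 1 FROM filings WHERE accession = ?", (accession,)
            ).fetchone()
            if seen is None:
                db.execute(
                    "INSERT INTO filings (accession, cik, dirty) VALUES (?, ?, 1)",
                    (accession, cik),
                )
                new_rows += 1
        db.execute("UPDATE filers SET dirty = 0 WHERE cik = ?", (cik,))
    db.commit()
    return new_rows
```

The nice property of this design is that each pass is idempotent: rerunning either one after a crash just redoes work for rows that are still flagged.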
The main feature I added because of WhaleWisdom was the data download. For any filer you can download all data in CSV or JSON, as on WhaleWisdom that's unavailable. I also plan to add a bunch of features those sites have eventually.
Tiger Global: https://last10k.com/sec-filings/1167483
Ruane, Cunniff Goldfarb: https://last10k.com/sec-filings/1720792
504: GATEWAY_TIMEOUT
Code: FUNCTION_INVOCATION_TIMEOUT
If the data changes irregularly, you’re probably better off making it a static site and having a script update it periodically, also to avoid excessive cloud charges (since you seem to be hosting this on Vercel).
The problem with making a static site is that there are over 800,000 SEC filers, so it would be impossible to query all of them and store it.
I hadn't expected so much traffic, so I really have no clue how to handle this without excessive cloud charges. The best I've done so far is to look into free hosting for open-source projects and add a donation link to the homepage.
The pre-processed data would have to be served from somewhere though. I'm not sure if GitHub could be used to host.
If not, Scaleway Stardust includes a bit of disk, 75GB free S3-compatible storage, and most importantly: free 100mbps outbound data transfer for < $4/month.
There are probably other cheap shared web hosts that claim unlimited data transfer but not sure they'll deliver.
https://www.data-liberation-project.org/datasets/
Awesome work!!
Here’s an idea to monetize it: implement collaborative filtering (example: funds in rows, assets in columns).
Once you do it, you will be able to cluster similar funds. Let’s say X funds have the asset A, but Y funds do not have it, and X and Y belong to the same cluster. Thus, if you recommend asset A to the Y funds, there’s a high probability they’d add it to their portfolio (if properly pitched). This is roughly how Netflix recommends movies, Spotify recommends songs, etc.
A lot of players in the industry would pay high $$$ for a recommender system like that. And you already made 80%: it’s only missing the final machine learning part (which is the fun part :))
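To make the idea concrete, here is a rough sketch of one simple flavor of it: neighborhood-based collaborative filtering over binary holdings, scored with cosine similarity, rather than an explicit clustering step. The fund names and tickers are made up, and a real system would weight by position size and use far more data.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sets of assets (binary holding vectors)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(holdings, fund, top_n=3):
    """Rank assets `fund` does not hold by similarity-weighted votes.

    `holdings` maps a fund name to the set of assets it holds, i.e. the
    "funds in rows, assets in columns" matrix stored sparsely.
    """
    mine = holdings[fund]
    scores = {}
    for other, theirs in holdings.items():
        if other == fund:
            continue
        sim = cosine(mine, theirs)
        if sim == 0.0:
            continue  # funds with no overlap contribute nothing
        for asset in theirs - mine:
            scores[asset] = scores.get(asset, 0.0) + sim
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [asset for asset, _ in ranked[:top_n]]


# Entirely hypothetical portfolios, for illustration only.
holdings = {
    "fund_a": {"AAPL", "MSFT", "NVDA"},
    "fund_b": {"AAPL", "MSFT"},
    "fund_c": {"XOM", "CVX"},
    "fund_d": {"AAPL", "MSFT", "NVDA", "GOOGL"},
}
print(recommend(holdings, "fund_b"))
```

In this toy data, fund_b overlaps with fund_a and fund_d but not fund_c, so the energy names never show up in its suggestions.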
This should really be a lib that takes a folder of 13F forms and outputs a csv, or something like this. That's it. No need for webapps or whatever.
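Something along those lines could be sketched with just the standard library. This assumes each file in the folder is a 13F information table in XML, with `infoTable` entries containing `nameOfIssuer`, `titleOfClass`, `cusip`, `value`, `sshPrnamt` and `sshPrnamtType` (that is my understanding of the SEC schema; adjust the field list if your files differ). The function names are my own, and full-submission `.txt` files would need the XML extracted first.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

# Columns written to the CSV; "file" records which filing a row came from.
FIELDS = ["file", "nameOfIssuer", "titleOfClass", "cusip",
          "value", "sshPrnamt", "sshPrnamtType"]


def local(tag):
    """Drop the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_info_table(xml_data, source=""):
    """Yield one dict per infoTable entry, ignoring XML namespaces."""
    root = ET.fromstring(xml_data)
    for node in root.iter():
        if local(node.tag) != "infoTable":
            continue
        row = {"file": source}
        for child in node.iter():
            name = local(child.tag)
            if name in FIELDS and child.text and child.text.strip():
                row[name] = child.text.strip()
        yield row


def folder_to_csv(folder, out_path):
    """Write every holding found in *.xml under `folder` to a single CSV."""
    count = 0
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS, restval="")
        writer.writeheader()
        for path in sorted(Path(folder).glob("*.xml")):
            for row in parse_info_table(path.read_bytes(), path.name):
                writer.writerow(row)
                count += 1
    return count
```

Usage would be something like `folder_to_csv("filings/", "holdings.csv")`. Note that this only gives you the raw filing fields, with no tickers or prices attached.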
The problem I found, though, is that you can't really just use the raw filings. The data is much more useful when the stocks are queried with third-party APIs and organized along with things like recent price data. This project alone uses three APIs, and while you could include that in a library and force the developer to get three different API keys, it just works better as a service.
If a "library" is all you want though, the API is available with documentation. There's also 13.info, an open-source project that predates this one. Although it is still a service, it is more like a library and could probably be used like one.
Do you have a plan to expand with new features and are you looking for contribution?
Will the database be updated automatically?
As for contribution, I am definitely looking for it. I have not been in the open-source field for long so I don't exactly know how to get it, but I would highly appreciate it if anyone could help.
Contribution would be especially helpful since I was still learning a lot of the technologies I used while I built this project, and the code is prone to newbie mistakes.
This is likely due to the fact that the database is huge, and providing that data on demand is very resource intensive, especially when there are forty different people sending many requests a second.
If you don't want to compromise on data though, look into spending a little bit more time/money on infrastructure. I wish I had deployed the project on Kubernetes, instead of what I ended up doing.
I have been tearing my hair out for the last hour trying to get my Always Free Oracle instance running again.
imo you're on a great path, stick with it!
This project was a way to use my web development experience for a good cause, and to learn a lot along the way. I hope to improve the project thoroughly through user suggestions, then maybe hand it off to other people. Like I said in my post, my main goal is to one day get into start-ups, so I can create good in the world through them. This is hopefully the first step in that journey.
Thanks for your kind words.
No. SEC keeps tabs on publicly traded companies, large institutional investors and the like. Hence the 'securities' and 'exchange'. Most companies in the US are private and not publicly traded. And most companies are not asset managers.
> and valuable analysis is often hidden behind a paywall.
Sure, but that's because analysis is valuable.
> I made this project to better democratize SEC filings
The SEC already makes the information publicly available via their EDGAR system.
> In the comments, I'd appreciate any and all advice, as well as feedback on how to improve the site.
I'm not sure who your target audience is. Finance companies already have internal teams pulling this kind of data for them or they buy it from finance data providers. As for the general public, what need do they have for top 'filers'? As for me, I'd rather dump the data into a database or Excel and query it rather than looking at a static page.
At the very least, don't make the site static. Try to add a spreadsheet. Or at the very least a sorting option. Also, Vanguard, BlackRock, etc. aren't top filers. They are the largest or top asset managers. Maybe tie news stories to each asset manager? And more data? But as I said, not sure what the value here is for the end user. Why would I or anyone else ever use your site? Or better yet, what do you use the site for?
My advice is: Keep working, keep learning. If you love computers and want to work for a startup you absolutely have what it takes to make that happen. And if there are no startups near you which are right for you, you can found your own.
Some ideas:
* Cluster the 13Fs into buckets. Performance buckets and volatility buckets and aggressiveness buckets.
* Build model portfolios that blend holdings from the best performers.
* Do some basic regime analysis and find which 13F filers perform best during the various regimes.
Why did they vanish from the list? Maybe a data bug?
I used to have it bookmarked