My name is Melvin and I am currently working on an MVP for a web service called View, to make it easier for developers to upload, process, and deliver media in their apps.
The idea came to mind while I was working on a photo-sharing app and noticed first-hand how needlessly complex and expensive existing services are.
I wished someone would create an easy-to-use and affordable API/SDK à la Stripe, but for audio, video, and images.
Is this something that was a pain point for you? I'd love to hear about your experiences building apps with user-generated content.
Cheers
User authentication.
What is everyone using for this? How do you turn a static website where a user can set some configurations (say the color scheme) into a site where the user can log in and save their settings?
In the past I rolled my own solutions. But for new projects I am considering using a library or framework.
I guess Django, Flask, Laravel, Symfony and Express all come with some default auth mechanism. What is HN's experience? Are you using these? Are you happy with them?
I did a quick test like this:
apt install -y python3-django
django-admin startproject mysite
cd mysite
python3 manage.py migrate
python3 manage.py runserver 0.0.0.0:80
And it serves a Django site, but there seem to be no routes for users to sign up, log in, etc. The urls.py file contains only this one route, it seems:
urlpatterns = [
url(r'^admin/', admin.site.urls),
]
First of all, it is a service. That opens a can of worms on its own. The main issue being that you are 100% at the mercy of the service provider. Every time they decide to change their API, it adds maintenance work to your project. Sometimes it happens on short notice which can be very annoying.
Second, it is a strange mix of service and code. I don't see an easy way to use it without their JavaScript SDK. And installing that SDK looks complicated from what I see.
- Firebase is a juggernaut of a library on the frontend: https://bundlephobia.com/package/firebase@8.10.0 Our product's main feature is being fast - it has the word "instant" in the name. For a product with users around the world, many of them will feel that network request and/or increased execution time. This is fine at the prototype stage or for teams which don't have the resources to implement a case-specific auth implementation that'll be lightweight and efficient. In our case we felt kind of stupid; this should have been clear to us from the beginning, but we wanted to move quickly. Ultimately this cost us time, and to be frank, that was mostly on me!
- Integration with other systems wasn't always smooth or simple. The Firebase Admin documentation left a lot to be desired. There are a lot of quirks all over. Some fields during some authentications might be empty for example, but it wasn't clear why or what this meant - this meant a lot of deep-diving and experimentation. We were using the official Go library. Sometimes we could use the library, other times we'd need to write a request out to Google's APIs. We made a lot of passes to improve on this thinking we must be missing something, but after hitting various existing issues online where developers dismissed the problem, it became clear that this is just life for a Go server supporting Firebase Admin.
I do recall that Android and Node.js support appeared much better, so if you suspect you're using a better-supported ecosystem, maybe this won't get in your way.
- Something that a better developer might understand and navigate better than I did was the lack of assurance of data being present or its structure being consistent. Fields coming back for users seemed slightly inconsistent (not a problem for most requests). I wrote a parser for each provider to normalize data because occasionally we'd be missing the user's email or something. For example, as I recall, getting the email from a Twitter-authed user could be different from getting the email for a Google-authed user. I'll admit we had other issues to face so I spent the least time on this, then we dropped Firebase before I could revisit.
- Like any off-the-shelf solution, it ends up having significant limitations. This can be a great thing too, but for us it was a deal breaker. You can assign metadata to auth profiles but this felt too flimsy to us. I think this is a general Firebase issue, not specific only to auth: data integrity is poor. It was an ever-present problem that we couldn't attach users to our other data with rock solid guarantees. It felt like auth was almost ephemeral in our stack, and without fully owning it, it was as though our users weren't the cornerstone of our application but a floating member out in space.
Despite all of that it's a great product and I highly recommend it if it fits your needs. I'm not slamming the developers behind it. I think they're well aware that it's not for everyone, and they've done a great job making it work for as many people as it does.
What would be the minimum number of commands I have to type into a command line (say, on a fresh Debian install) to get a simple site up and running that lets users do the basic stuff like sign up, log in, log out, change password, delete profile?
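Whichever framework turns out to be the answer, the basic operations listed above reduce to a fairly small core. Here is a minimal sketch of that core using only the Python standard library. It assumes an in-memory dict as the user store, and every function name in it is hypothetical; there are no sessions, no rate limiting and no persistence, so treat it as an illustration of what a framework's auth module handles for you rather than a recommendation to roll your own.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory store: username -> (salt, password_hash).
_users = {}
ITERATIONS = 200_000  # assumed work factor, tune for your hardware


def _hash(password: str, salt: bytes) -> bytes:
    # Salted PBKDF2-HMAC-SHA256; the plain password is never stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def sign_up(username: str, password: str) -> bool:
    if username in _users:
        return False  # name already taken
    salt = secrets.token_bytes(16)
    _users[username] = (salt, _hash(password, salt))
    return True


def log_in(username: str, password: str) -> bool:
    record = _users.get(username)
    if record is None:
        return False
    salt, stored = record
    # Constant-time comparison to avoid leaking timing information.
    return hmac.compare_digest(stored, _hash(password, salt))


def change_password(username: str, old: str, new: str) -> bool:
    if not log_in(username, old):
        return False
    salt = secrets.token_bytes(16)
    _users[username] = (salt, _hash(new, salt))
    return True


def delete_profile(username: str, password: str) -> bool:
    if not log_in(username, password):
        return False
    del _users[username]
    return True
```

Logging out is missing because it belongs to session handling, which is exactly the kind of thing a framework's auth layer adds on top of this.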
I don't have experience with those frameworks, but the Rails equivalent, Devise, is very good. I assume it's the same for any mature framework.
He is not asking for trouble.
>an MVP for a web service called View
Nitpick: if someone types in "user-generated content view" to their favorite search engine, they're not likely to find you. I suppose running a blog with good content might be helpful so that someone might find the service using a search term like "video streaming api" or similar.
On topic, you could try attaching a domain-specific word to the end until you're big enough to take over the generic one, à la Flock Freight, Cured Health, Glow Credit etc. Lot of those lately.
The way it works is that the platform uses our SDK, through which they "send" all uploaded content. The SDK generates a fingerprint that is sent to our service and a license is issued. The license is either permissive or non-permissive, so it tells the platform and the uploader (creator) what to do. We also provide payment distribution, usage reporting and ADR (alternative dispute resolution).
If a platform uses our service, we indemnify them from any liabilities under DMCA, EUCD and others up to $50M in damages (legal cost and/or court orders).
We charge a percentage of the revenue generated by the platform for our services. If the platform generates no revenue, the service is free. We don't charge per lookup, nor is there any scale limit (well, there are throughput limits, but they are far beyond what most platforms need [currently we can search around 1.1k hours of content every second]).
We cover video, sound recordings and compositions. Images are coming late this year and text sometime next year.
Just a quick note on the moderation. The benefit of our structure is that the moderation is "outsourced" back to law enforcement and government agencies (like NCMEC and the FBI in the US, the Bundespolizei in Germany, etc.). This means platforms don't have to hire people to moderate the covered content, because the liability is transferred back to the organizations that create it in the first place.
[0] https://pex.com
We built our adminless Internet forum [1] for the same reason. It's splendid on the dev side because initially we built traditional moderation tools (they still exist in Git), but then we deleted it all and it felt much simpler.
Still waiting to see if it will work from a moderation standpoint. The site's been live for almost a year with no issue yet.
Edit: It's a text only forum, which makes this a lot easier. I was thinking about allowing images but with a high per image fee.
Politics has a way of turning into violent threats, pictures of nooses, etc.
Then there is the spam, actually that comes before the pornography.
As mentioned, there is child porn, various other forms of illegal porn, illegal violence, hatred, propaganda to incite violence, etc... Then of course any kind of content you might not approve of due to any personal or company policies, if relevant.
All of our material will be forced to be public so the user should have no expectation of privacy, but for many people this won't matter. Just look at Facebook; that's a public-facing service, and people upload atrocities to it constantly.
You'd do well to look at established services. Craigslist's list of prohibited content, the T&C of Facebook, Twitter, Reddit, etc., are going to be useful.
Just off the top of my head:
Cyberstalking, pornography, child porn, piracy, malware, fraud, illicit goods (guns, drugs, black/grey market, stolen goods), intimidation, gangs, bullying, alcohol, tobacco, prescription medications, hoaxes, various ineffective / "alternative" products and remedies (which themselves run the gamut of legality, even defining this is at best difficult), advertising, advertising for protected or regulated sectors / goods (housing, employment, personal and professional services, beauty care, escorts, security services, licensed professionals, ... As with the goods section, this rapidly gets complex), legal services / aid, political activities, fomenting revolution / freedom fighters.
User-generated content is a massive concern.
One concept I'm seeing getting increased traction is a focus not on the quantity of posted content but the prevalence or level of access or views. Facebook and YouTube especially are increasingly discussing problematic content not in terms of posts or videos, but of views or presentations of those.
This ... starts making trade-offs in moderation much more viable, principally because there is an inverse logarithmic relationship between the number of items and the views: If m items get n views, then 10*m items get n/10 views. Very roughly.
This means that you can set a goal in terms of the number of items viewed (and see what the maximum unmoderated prevalence will be), or target a specific prevalence and determine how many reviewers will be required.
For human moderators, the number of items reviewed per day seems to be in the 500--800 range. Note that 800 items/day in 8 hours is 100/hour, or about 1.7 per minute, or 36 seconds per item. That's inclusive of breaks, overhead, and non-moderation tasks.
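To make that arithmetic concrete, here is a rough Python sketch. It assumes the 800 items/day and 8-hour figures above, and it additionally assumes views fall off as 1/rank (a Zipf-like curve), which is my own simplification of the "very roughly" relationship and not something taken from FB or YouTube data. All function names are hypothetical.

```python
import math


def seconds_per_item(items_per_day: int, hours_per_day: float = 8.0) -> float:
    # Time budget per reviewed item, breaks and overhead included.
    return hours_per_day * 3600 / items_per_day


def reviewers_needed(items_to_review_per_day: int,
                     items_per_reviewer_per_day: int = 800) -> int:
    # Round up: a fraction of a reviewer still means one more person.
    return math.ceil(items_to_review_per_day / items_per_reviewer_per_day)


def view_share_of_top_items(total_items: int, reviewed_items: int) -> float:
    # Share of all views that land on the most-viewed `reviewed_items`,
    # ASSUMING the item at rank r receives views proportional to 1/r.
    def harmonic(n: int) -> float:
        return sum(1.0 / r for r in range(1, n + 1))

    return harmonic(reviewed_items) / harmonic(total_items)


if __name__ == "__main__":
    print(seconds_per_item(800))        # seconds available per item
    print(reviewers_needed(10_000))     # staff for 10,000 reviews a day
    print(view_share_of_top_items(100_000, 1_000))
```

Under that assumed curve, reviewing only the top 1% of items by views already covers well over half of all views, which is the trade-off the prevalence framing is getting at.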
Moderation itself is a very psychologically loaded task. You'll either want to rotate people through it from other functions, or see a heck of a lot of staff turnover.
If anyone has greater insights from one of the current large UGC services (FB, Twitter, Instagram, Whatsapp, TikTok, Imgur, Reddit, etc.), I'd really like to know what current internal practices are.
Some of my previous work had some incidental exposure to this area (I was tasked with removing identified content, working on both our internal and external CDN provider to do so). After a couple of spot checks to see I was unlikely to be deleting content which wouldn't meet removal criteria, as in literally two, I decided I simply didn't want to take the risks of performing additional checks. My removal process turned out to be quite effective --- what the CDN provider's specs suggested might be a weeks-long process removed some millions of items over a weekend. That was on what is by current standards a very modest-sized social network.
I've written on this previously citing YouTube and Facebook sources here: https://joindiaspora.com/posts/f3617c90793101396840002590d8e...
They offer a pretty generous free tier for personal stuff as well, although I wish they had some plan that was between the free tier and the $99/month cheapest one, which is quite a steep increase from paying nothing.
Can you give some examples?
Out of everything I would consider 'complex', handling media wouldn't even make the top 1,000 and services like AWS mean you can store petabytes for pennies.
None of this is a simple task, and you also need to serve the correct files to the correct clients - fun!
That's been the hardest part for me as a developer. I've had several extended debugging sessions that ended up being "the media is encoded incorrectly for this Android device".
Someone woke up one day with an $8,000 bill while encoding videos:
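As a small illustration of the "correct files to the correct clients" problem, here is a sketch of picking an image rendition from a request's Accept header. It assumes you already store each image in several formats; the function name and the format list are hypothetical, and it deliberately ignores q-values and wildcards, which a real implementation would need to handle.

```python
def pick_image_format(accept_header: str,
                      available=("avif", "webp", "jpeg")) -> str:
    # Collect the media types the client explicitly lists, dropping
    # parameters such as ";q=0.8".
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    # `available` is ordered from most to least preferred by the server.
    for fmt in available:
        if f"image/{fmt}" in accepted:
            return fmt
    # Fall back to the format nearly every client can decode.
    return "jpeg"
```

Video is considerably harder than this, since codec, container and profile support all vary by device, but the shape of the decision is the same.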
https://github.com/awslabs/video-on-demand-on-aws/issues/48
There's also this pretty architecture overview in the repo:
https://github.com/awslabs/video-on-demand-on-aws/blob/maste...
In my opinion, it's too complex and I'd prefer a simpler solution to get to market faster.
Imagine adding just a few lines of code to get started instead of setting up all these things on AWS, where you might end up paying up to $0.12/GB for outgoing bandwidth as well.
Some folks absolutely need to build their own video encoding pipeline for one reason or another--but the happy path's pretty well-established for folks who just Need Some Video, IMO. At Mux (full disclosure: I'm on the DevEx team there), getting up and running with video is one API call to CreateAsset, followed by an HTTP PUT if your video file isn't already accessible via HTTP somewhere. IMO, hard to be simpler than that.
AWS is expensive, but you're right in that it doesn't have to be. Measuring like-for-like is difficult, and per-minute rather than per-GB pricing tends to make more sense for most developers, but for 1080p content Mux is usually around half the price of AWS IVS. Which, having done this before at a previous job--if you want to stay in all-AWS land, is a way better call for your sanity than trying to hack it out yourself.
Secondly, you might need to scan the content somehow, again for malware and possibly allow other users to report it.
Of course there's the issue with copyrighted and illegal material - there should also be a way to report this, or detect it. I guess it means you need to be aware of and comply with the server country regulations which can be tricky in some countries.
>are you also encoding videos for adaptive streaming?
Can't help you with that, I only worked with images, best of luck!
If video is of interest, I'd definitely take a look at that and see if it meets your needs. It's one of those specialized serverless offerings from AWS that you pay as you use.
Vimeo went in the opposite direction. They have a tiered pricing model charging for the use of the features and tools they offer. However, they pivoted away from catering to a large and diverse audience. Instead they focus on B2B communication. Vimeo is excellent for professional videographers or marketing agencies publishing video content.
The takeaway here is that the expense of processing and publishing video isn't going to drop dramatically in the short run. As hardware became ever more powerful over the past 2 decades, the demand for high-quality video has kept pace in lockstep: 1080p, 4K,... So, the costs associated with hosting high-quality content haven't dropped significantly in that regard. There's not much you can change about that.
What you do control is the business model you develop to cover the costs of hosting content. And that means finding a profitable market and asserting a good product/service fit up front, if you can.
Instead of building an app "to make it easier for developers to upload, process, and deliver media in their apps", OP ought to think beyond developers, and rather direct themselves to those who are either producing or consuming content.
Another way of thinking about this might be: Which problem really is getting solved? And whose problem really is it? Is it hosting audiovisual bitstreams? Is it processing? Is it just providing an at-the-ready API which allows users to just easily upload and embed audiovisual material everywhere without even requiring any technical knowledge?
The latter, by the way, already exists in several forms on the Web - e.g. https://oembed.com/ - which is already implemented in e.g. WordPress and supported by large social media platforms such as Instagram and YouTube.
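For anyone who hasn't looked at oEmbed: the consumer side is basically one GET request with the page URL as a query parameter. Here is a minimal Python sketch that only builds the request URL. The `url`, `maxwidth` and `format` parameters come from the oEmbed spec; the function name and the example.com endpoint are placeholders, since each provider publishes its own endpoint.

```python
from typing import Optional
from urllib.parse import urlencode


def oembed_request_url(endpoint: str, media_url: str,
                       max_width: Optional[int] = None,
                       fmt: str = "json") -> str:
    # `url` is the only required parameter; it must be URL-encoded.
    params = {"url": media_url, "format": fmt}
    if max_width is not None:
        params["maxwidth"] = max_width
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical provider endpoint and media page.
    print(oembed_request_url("https://example.com/oembed",
                             "https://example.com/v/1",
                             max_width=640))
```

The response to that request carries the embed details (for a video, an HTML snippet plus width and height), which is what lets a non-technical user paste a link and get a player.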
How do you explain to Google that somebody else is responsible when they decide to nuke your app?
Building the upload/delivery stream was easy, it's all the 'needless complexity' that adds value. You know, like privacy controls, access control, moderation, image formats and optimised delivery, indexing, search, tagging, etc.
I'll probably reimagine it and find better ways of hosting content but with my vision and UX. Maybe Cloudinary as folk have suggested, maybe some other SaaS DAM product with a solid track record - I doubt I'd trust an early stage MVP if my customers need to rely upon it. Too risky for my tastes.
Now if there was a moderation-as-a-service (MaaS?), that would have uses at the right price point.
We were paying $100,000 a month storing user-generated content, and now are paying about $10,000 a month after migrating
I had the exact same idea as you, and shelled out a few hundred dollars for a domain from a squatter. My prototype basically reinvents fault tolerant resumable uploads (like tus.io). On the backend, it streams the file to Wasabi and Backblaze. That's as far as I got. Video/audio scares me, but I'll get to it eventually.
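For readers wondering what "resumable" means in practice, the core idea is that the server remembers how many bytes it has, and the client asks for that offset before sending more. Here is a small in-memory Python sketch of that idea. It is not the tus protocol itself and not my prototype's code; the class, the function and the simulated failure are all hypothetical stand-ins for real HTTP requests.

```python
from typing import Optional


class UploadTarget:
    """Stands in for the server side of one upload."""

    def __init__(self) -> None:
        self.data = bytearray()

    def offset(self) -> int:
        # How many bytes have been stored so far.
        return len(self.data)

    def append(self, offset: int, chunk: bytes) -> int:
        # Reject chunks that do not continue exactly where we stopped.
        if offset != len(self.data):
            raise ValueError("offset mismatch")
        self.data.extend(chunk)
        return len(self.data)


def upload(target: UploadTarget, payload: bytes, chunk_size: int = 4,
           fail_after_chunks: Optional[int] = None) -> bool:
    # Always start from the server's offset, so a retry resumes
    # instead of starting over.
    offset = target.offset()
    sent = 0
    while offset < len(payload):
        if fail_after_chunks is not None and sent >= fail_after_chunks:
            return False  # simulated dropped connection
        offset = target.append(offset, payload[offset:offset + chunk_size])
        sent += 1
    return True


if __name__ == "__main__":
    target = UploadTarget()
    payload = b"hello resumable world"
    print(upload(target, payload, fail_after_chunks=2))  # interrupted
    print(upload(target, payload))                       # resumes, finishes
```

Calling `upload` a second time after a failure only sends the bytes the target does not have yet, which is the whole point on flaky mobile connections.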
I really like the content moderation as a service (via AI, or humans) idea that others have mentioned.
Very similar to how I'd handle toxic waste. I'd touch it as little as possible, and ideally I'd like it to be someone else's problem.