What legal professionals actually pay for, and what is unfortunately virtually impossible to replicate, is giving the AI access to a legal database of case law. Without case law, you can't do accurate legal research, and you are inviting disaster if you're doing things like drafting statements of case or skeleton arguments.
There's a reason why companies like Thomson Reuters have an oligopoly on these types of products, and can get away with charging thousands a year. They are the only ones with access to a comprehensive set of case law, and they've entrenched their position by having exclusive contracts with the law reporting companies. Without that, your model is just relying on publicly available cases that it can find on Google etc., and that's just a fraction of the full set.
With that said, these types of competitor products can be useful if you're just doing simple tasks like drafting letters or reviewing contracts and you accept that you need to do the legal research separately. But again, you can get that with just ChatGPT + a good prompt.
I'm not in the legal field, but can someone explain that further? I would have expected that all case law is public access. Not necessarily easy access, but when a judge writes an opinion, why on Earth would that opinion be gated behind a corporation? What am I missing?
In theory, any member of the public can obtain a judgment by applying for one at the court and paying a fee. That's fine if you just need a one-off judgment, don't mind paying the fee, and you're not in a hurry. It also assumes that you know which case you need.
For realistic legal research, you might need to wade through dozens of cases just to work out whether any of them are relevant, you might have a deadline of tomorrow to get it done, and you might not want to pay that fee for a bunch of cases that you aren't going to end up needing. Only a company which already has a comprehensive copy of virtually every important case can help you here.
A typical workflow for a complex piece of legal research might look like this:
1. You need to research a legal topic.
2. Do some Googling, or chat to your LLM, to get a rough overview and some pointers for further research (but don't completely rely on what you find).
3. Read some professional content (e.g. Practical Law articles relevant to the topic, or a legal textbook).
4. Read the relevant legislation.
5. Use a legal database to download all the cases you found from steps 2 and 3 which seem like they might be relevant.
6. Use a legal database to download all the cases which cite the relevant legislative provisions you found in step 4 and seem like they might be relevant.
7. Use the legal database to confirm that those cases are still good law (not overridden or criticised by a later case).
8. Skim read them, discard those that turned out to obviously not be relevant.
9. Read the remaining ones more closely.
10. Note any useful-looking cases which are cited in the ones from step 9, and recursively work your way through those cases as well.
Relying on court-provided copies of judgments won't realistically help you with most of these steps.
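Steps 5-10 above are essentially a recursive traversal of a citation graph, which is why database coverage matters so much: any case missing from the database silently drops its entire citation subtree. A minimal Python sketch, where `database`, the case names, and the `is_relevant` predicate are all hypothetical stand-ins (not a real legal database API):

```python
def research(database, seed_cases, is_relevant):
    """Recursively collect relevant cases by following citations (steps 5-10).

    database: dict mapping a case name -> list of cases it cites (hypothetical)
    seed_cases: cases found via overview reading and legislation (steps 2-4)
    is_relevant: predicate standing in for the lawyer's skim-read (step 8)
    """
    found, to_read = set(), list(seed_cases)
    while to_read:
        case = to_read.pop()
        if case in found or not is_relevant(case):
            continue  # already handled, or discarded as irrelevant (step 8)
        found.add(case)
        # note cited cases and work through them recursively (step 10);
        # a case absent from `database` contributes no further citations
        to_read.extend(database.get(case, []))
    return found

# toy example: B cites C, which cites D; A turns out to be irrelevant
db = {"A": [], "B": ["C"], "C": ["D"], "D": []}
print(sorted(research(db, ["A", "B"], lambda c: c != "A")))
# prints ['B', 'C', 'D']
```

The loop is the easy part; the hard part, as the thread points out, is that `database` must be close to complete for the traversal to be trustworthy.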
It's obviously even more important for judges (compared to lawyers) to be able to easily search all of the relevant case law to see which cases are controlling and take precedence. Seems bizarre to me that this critical function would be gated behind a corporation.
If tech companies invested 10% of what they have put into AI-assisted coding tools into AI-assisted legal tools, they would be able to do those steps easily.
It is definitely coming.
To access the DB through the modern archive (well, modern in the sense of new rules), you'd have to be an accredited professional, clearing a few legal hurdles and the digital chancellor's office for each copy. It's like visiting a bureaucracy^bureaucracy office.
Some early companies, given their initial foothold, weren't subject to these checks, so they were able to get hold of bigger archives (it's also important to remember that legislation and conformity rules are often shaped through consulting or lobbying by these entrenched players).
They can also build on the data that professionals themselves submit.
It doesn't work that way in the US. Legal judgments are documents of public record, and they are normally published in full - it's not uncommon to search for someone's name and see legal cases pop up that they have been involved in.
There are specific instances where a judge can seal a judicial record (and records for minors are sealed automatically), someone may petition for their own records to be expunged, and parties may ask for some information to be redacted, but these are normally (except in the case of minors) not done automatically. As I understand it, the US has much more lax rules around the publicity of legal proceedings than other jurisdictions. For example, even though someone is deemed "innocent until proven guilty", arrest records and mugshots are reported all the time in the media even though this would be illegal in many other areas of the world.
Theoretically speaking, if someone scraped all of it and added it to something like this open source Mike project, would that then be a much better tool for lawyers?
Better than before, yes. Good for general legal work that doesn't require robust legal research, yes. Sufficient for full legal research, no.
The problem is that "a lot of case law" isn't enough case law. You need close to everything. Otherwise this can happen: CanLII case X -> legal principle Y. Westlaw case Z, not on CanLII -> X overridden, Y no longer good law. Or you might simply not find a case which cogently supports your argument when one does in fact exist. Or, conversely, you are unaware of a detrimental case which your opponent knows about because they have Westlaw.
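That failure mode can be made concrete with a toy sketch (case names and the dict structure are invented for illustration, not a real database schema): a "good law" check against the smaller database gives the wrong answer purely by omission.

```python
# Hypothetical: case Z overrides case X, but Z exists only in the larger
# database. Each dict maps a case to the case it overrides (None if none).
full_db = {"X": None, "Z": "X"}   # comprehensive coverage (Westlaw-like)
partial_db = {"X": None}          # smaller database, missing Z entirely

def is_good_law(db, case):
    """A case is good law if no case in the database overrides it."""
    return all(overridden != case for overridden in db.values())

print(is_good_law(partial_db, "X"))  # True  -- looks safe, but only by omission
print(is_good_law(full_db, "X"))     # False -- Z overrode X
```

The check itself is trivial; the danger is that the incomplete database returns a confident, wrong "True" with no indication that anything is missing.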
I'd not use generative AI for anything but a cursory check anyway⁰. Even if it is trained on clean up-to-date data rather than all the wrong information that is out there, it could still give a wrong answer and I have no leg to stand on if I rely upon it. At least if I pay a human and they trust the LLM too much, I'll hopefully have some call to pursue them for giving bad advice when it bites me.
--------
[0] Or at all… But even if I wasn't someone actively avoiding LLMs, the point would still stand
And this is not likely to change in the future, as the legal market is so small and niche that the leading makers of LLMs have put legal analysis near the bottom of their list of priorities in terms of improving model performance. There is very little if any effort by the major LLM companies to curate sources of additional high-quality legal training data or fine-tune their models to improve performance on legal tasks. Law is also a field with very low tolerance for error, where tiny mistakes can have big consequences, and getting the models to perform well under these constraints would require a lot of investment without a sufficient payoff.
The true reason big firms are buying Harvey and Legora subscriptions is simply to use an LLM for LLM-type tasks, like document review, spotting issues in user-provided documents, and other things that LLMs do well. True, services like Harvey and Legora have lots of cool templates and features for legal work, but you will find that most of the people who use these services in these firms use them much the same way they'd use ChatGPT, Claude, or any other AI chatbot.
The reason law firms can't just use ChatGPT or Claude is that they can't allow confidential or privileged client data (such as documents provided through the prompts) to be stored and hosted on a third party service like ChatGPT or Claude, as these companies may have to turn over client data in response to subpoenas, and depending on the type of LLM account you have, these companies could use your user prompts to train future models thus risking leakage of client data to third parties and potential privilege waiver.
Services like Harvey and Legora solve this problem by accessing the LLMs through APIs, and all client data, prompts/responses, etc., are stored encrypted on Harvey or Legora servers and protected by keys held by the customer. For many law firms, this is 95% of what they're paying for.
The big challenge "Mike" presents to services like Harvey and Legora is that it exposes how little additional value they offer over ChatGPT or Claude, for the vast majority of law firms. A system like "Mike" can provide the same security benefit at basically $0 cost, and can be hosted on the law firm's own internal servers. This is going to put a lot of pricing pressure on services like Harvey and Legora; law firms are notoriously cheap when it comes to IT and software spend and will switch quickly if cheaper alternatives arise. This means Harvey and Legora are going to have to sell their services based on the value they add to lawyer productivity, and not just on being a protected wrapper around GPT or Claude.
One question I have about legal AI startups/products is how they maintain or improve upon the billing practices of law firms.
Having worked with a bunch of lawyers, I know that I'm often paying $500/hr to that firm. That work is actually done by a paralegal who is being paid $40/hr, and then I'm being billed through the partner for an extra $460/hr. This is a gross oversimplification, but you get the point.
If the partner needs to bring in $5M a year, how does any addition of tech solve that?
If I'm the customer of the law firm, I would love a more cost-efficient way to get legal advice. But I don't understand how those incentives align for the partner. I don't really think they want a more efficient result for their customers; they want a better way to get more billable hours. Adding "tech efficiency solutions" does not solve that issue at all.
Inevitably, customers will use LLMs on their own, and as people have noted, lose attorney-client privilege (and often get hallucinated bad advice). There will probably be some very comical courtroom dramas when people try to represent themselves with an LLM on their shoulder.
Am I misunderstanding something fundamental about the legal world that will make a major law firm adopt this tech? I feel like there are some strong reasons they will universally avoid moving in this direction. Long term it will win and there will be blood on the floor, but why would any large firm adopt this stuff right now?
Also, an interesting example: in English litigation (where, broadly, the loser pays, unlike America, where each side bears its own costs), maximising billable hours is not always a viable strategy for anybody if those costs aren't recoverable on success. Someone involved in large-scale commercial litigation involving disclosure of millions of documents who doesn't use algorithmic document classification (now pretty broadly accepted as normal) potentially runs the risk of a judge determining that the costs of going through all the documents by hand aren't recoverable. Insurers/litigation funders aren't going to want to risk padding the costs so much that the judge prevents them from recovering their stake in the litigation.
Customers using their own LLMs: yep, they might do that. I think the pitch from the legal LLM providers is "we've got legally trained people doing RLHF to make it more accurate" mixed in with "also we've got a partnership with Lexis/Westlaw/etc. so we can do legal research that's better than what's on the open web", with a little bit of "if you get sued for professional negligence, 'I used the legal AI thing that's built into Westlaw' is gonna be more convincing to a judge and jury (and your insurance company) than 'I used ChatGPT, yes, like the app you've got on your phone'...".
We don't have paralegals/attorneys handle cases from beginning to end. We have different positions handle different tasks. One person may only do scheduling, another does discovery, another handles reviewing releases.
For us, adopting tech to make us more efficient is a priority. Our setup is a bit unique, but I can see PI and collection firms adopting tech similar to this.
I just don't understand how decision makers at a big firm are going to say yes to tech solutions when those solutions will kill the goose roaming their hunting grounds.
Or maybe it will be the more established open source model, where the code is free but the maintainers offer hosting or some default product.
It's not a big leap to apply that model to a company and its customers, where the company builds a well-abstracted, easily extensible base that 1) customers can easily extend/customize for their workflows, and 2) customers can self-host or run fully isolated much more easily (probably not quite there yet, but it's a possible world).
Sounds like your developers are relegating themselves to being review monkeys instead of developers
Engineering has moved up another layer of abstraction (just like we moved past managing buffers & writing machine code)
Potentially if used with a local LLM and not a service provider, this might protect attorney-client privilege?
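For what it's worth, the local setup is straightforward to sketch. Assuming Ollama is running locally with its default HTTP API, and using placeholder model names and prompts (everything here is an assumption about your setup, not advice on what suffices for privilege), something like this keeps the content on your own machine:

```python
import json
import urllib.request

# Ollama's default local endpoint (assumes a local Ollama install)
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model, prompt):
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(payload).encode("utf-8")

def ask_local_llm(model, prompt):
    # The only network hop is to localhost, where the model runs,
    # so the prompt contents never reach a third-party service.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# usage (requires Ollama running with a model pulled, e.g. `ollama pull llama3`):
#   ask_local_llm("llama3", "Summarise the attached engagement letter.")
```

Whether that actually preserves privilege is a legal question, not a technical one; the sketch only shows that the data-residency part is easy.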
In Google you're generally entering fairly generic and short search queries. The example you provided ("how to give yourself an alibi after murdering someone"), is generic and could apply to anyone or could have been entered for other purposes such as writing a crime novel.
With ChatGPT and Claude, the risk is much higher, because you're basically uploading entire documents with potentially privileged material to a third party, as part of your prompt. To use your analogy, instead of entering a generic "how to give yourself an alibi" query, you'd be providing privileged interview notes and other attorney work product as part of your prompt to the LLM. In the Heppner case (which actually involved a client and not a lawyer), detailed reports and discussions of potential strategies were uploaded.
But that only applies for clients using the chatbot. If a lawyer is using the LLM it is definitely protected. No different if a lawyer searches something on Google or Lexis Nexis. The search itself is protected. I guess you could debate metadata but the content surely is protected.
We're going to have to re-train ourselves on what hard work looks like (and thus what should be upvoted here).
I don't know whether the project's creator (@willchen96?) is a lawyer, or if they work at a law firm that helped them shape this, or how much time and effort they put into this, or whether law firms even want or need a vibe-coded open source project for their legal AI stack, but we should be considering the totality of those things when looking at new projects these days.
There are a lot of red flags here.
I don’t actually care that much about the work having been hard - I care about the result being good.
Cool project regardless!
If that's true, how does it actually achieve anything with respect to client confidentiality or anything else? (For example, there's the claim "the assistant keeps full context across every conversation and every document." --- but isn't that a function of the model one uses, which is on Anthropic or Google? Ditto the claim "Documents never leave your perimeter. Compliance, residency, and privilege stay under your control." But this is only true if you're not piping them to Anthropic or Google...) Is this just a user interface?
It would be nice if these product webpages included an easy way to find documentation, so that one could figure out what the product actually does. I can't find any obvious way to discern whether it can easily be used with a local model running via ollama, for example.
Since this is HN, I guess it's fair to assume it's for the US, but since English is used in more countries than the US, wouldn't it be a good idea to say outright which countries' legal systems this actually understands and supports? Or is it maybe meant to be country-agnostic somehow? If so, that isn't very clear either.
That confusion may come from the naming.
https://github.com/anthropics/claude-code/tree/main/plugins/...
Except that the font it is using is EB Garamond, and Apple was heavily using Garamond from the mid-1980s to the 2000s.
Given that almost everyone is copying both, it is now garbage.
That, plus an Anthropic-like logo.
Go look at the auth: it's a call to Supabase.
Go look at the migrations: it's like 5 tables.
There is a real need in the space and a real opportunity for a solution like this but this is a complete nothing burger of what exists in the underlying code.
The requirements for this kind of product are extensive and complex. The shape of the data layer is complex and nuanced. Absolutely none of this is considered or implemented in the project but it sure is blowing up.
Everything the incumbents ship, in an open codebase your firm owns.
vs
Everything the incumbents ship in an open codebase, your firm owns.
Lawyers live in docx, not pdf.
I use AI extensively in my legal work. But I check every citation myself, manually. That means that I read the entirety of every case that I plan to cite in my output, and I check on Westlaw that it hasn't been overridden by a later decision. If you're just producing the AI's output verbatim, then you have only yourself to blame when things go wrong in the courtroom.