If someone wants to configure an unauthenticated access from the Internet, they have to do the following extra steps:
- enable listening to the wildcard address;
- remove IP filtering for the default user;
- set up a no-password authentication;
It is possible to ignore and turn off all guardrails that the system has by default, but it needs extra efforts. However, it's possible that someone copy-pasted a wrong configuration file from somewhere without knowing what is inside, or do something like - listen to localhost, but expose ports from Docker.
A use case for direct database access exists, and is acceptable, assuming you set up a readonly user, grant access to specific tables, limit queries by complexity, and limit total usage by quotas. This is demonstrated by the following public services:
In this way, ClickHouse can be used to implement public data APIs (which is probably not what DeepSeek wanted).
ClickHouse has a wide range of security and access control restrictions: authentication methods with SSL certificates; SSH keys; even simple password-based auth allows bcrypt and short-living credentials; integration with LDAP and Kerberos; every authentication method can be limited on a network level; full Role-Based Access Control; fine-grained restrictions on query complexity and resource consumption, user quotas.
But still, according to Shodan, there are 33,000 misconfigured ClickHouse servers on the Internet: https://www.shodan.io/search?query=clickhouse This can be attributed to a high popularity of ClickHouse (it is the most widely used analytic DBMS).
When you use ClickHouse Cloud, which is a commercial cloud service based on the open-source ClickHouse database (https://clickhouse.com/cloud), it ensures the needed security measures, improving strong defaults even more: TLS, stong credentials, IP filtering; plus it allows private link, data encryption with customer keys, etc.