Copy ~/.mozilla/firefox/<profile_name>/*.sqlite into another directory and start playing with those files. It should be easy to come up with things you want to find out. Start with some basic single-table ones like 'what are my 10 most visited pages'. Then expand to use multiple tables ('what are my 10 most visited pages that I have not bookmarked').
Once you have an idea of how queries work, think about the database design and the decisions that led to it -- why did the developers choose to relate tables to each other that way? What kinds of queries are the indexes there to optimize for?
Once you've done this, rinse and repeat with a variety of programs so you can get multiple perspectives. http://www.sqlite.org/famous.html or google around for others.
If you want a more academic treatment, I would recommend C. J. Date's _An Introduction to Database Systems_.
Also, it's been my experience that books which claim to cover relational databases in general, but which actually assume MySQL throughout, are usually pretty bad.
Be warned that he's got an axe to grind - He's often critical of SQL because it doesn't live up to the mathematical elegance of the relational model. (I happen to agree with him.) He knows his stuff, though.
The "where" is on hard drives. Don't let anyone tell you that a purely memory resident data store is a database, it's not. It's a cache. If you aren't writing to disk you are not a persistent database. Think memcached and redis. Although I would say the later is more of a database because you can configure it to write on every update.
The "how" is where all the devilish details are. How do you put data into and get data out of a database. How does the database store your data on disk. Well, there are two main schools of thought in the market place at the moment. Relational[0] and non-relational. Relational is the older more mature, more well supported, more well understood, more "standards" compliant of the two. Nowadays, non-relational is championed by the NoSQL[1] movement which eschews the relational model for a looser concept, less defined, less "compliant", less understood model. You will find many more and varied answers to "how" in the non-relational space precisely because the renaissance in non-relational is so fresh (the concept itself has been around a long time). Nevertheless, NoSQL has been attracting considerable attention over the last year or so in the open source community.
Every developer worth his salt will know or at least be familiar with the relational model and at least one or two of it's implementations. MySQL, PostgreSQL and SQLite are just three of the most popular open source RDBMS's (relational database management system)[2] around today. They all have varying degrees of support for the primary language used to speak to RDBMS's, SQL[3].
There are many open source, non-relational implementations in production today. HBase, Cassandra, Riak, Voldemort and MongoDB are all playing in the NoSQL field. Popular systems within NoSQL draw their lineage from either the Google approach (gfs[4], big table[5]) or the Amazon approach (dynamo[6]). Yet another is Neo4j, a graph database[7] which, at the moment, is lumped under the "NoSQL" moniker but is a separate beast entirely. Just like their relational cousins, the way NoSQL solutions do what they do and what they are best suited for vary.
As with most interesting discussions, which approach is best is left as an exercise for the reader. I'll just throw my two cents in for good measure. First I would say that you need to know SQL. Knowing SQL and not a specific system (like mysql or postgresql) will allow you to move in between relational systems and let you pick the right one for the job. After taking a look and familiarizing yourself with the SQL world, I would take a very serious look at what is going on with NoSQL. Best I can tell, there is a lot of mindshare and "cool" factor foisted upon this space right now because of the scale non-relational concepts can bring to enterprise users. Be sure, though, that rdbms is the predominant player in the database world. Every company large and small use some form of rdbms. Newer more forward looking companies that have large and very large scale needs to store and analyze vast quantities of unstructured or loosely structured data are looking more and more to non-relational NoSQL solutions.
I specifically avoid linking to particular implementations but rather link to broader concepts here. It is the concept that is more important than the implementation when learning about the big and confusing world of databases.
[0] http://en.wikipedia.org/wiki/Relational_model [1] http://en.wikipedia.org/wiki/NoSQL [2] http://en.wikipedia.org/wiki/Relational_database_management_... [3] http://en.wikipedia.org/wiki/SQL [4] http://en.wikipedia.org/wiki/Google_File_System [5] http://en.wikipedia.org/wiki/Big_table [6] http://en.wikipedia.org/wiki/Dynamo_(storage_system) [7] http://en.wikipedia.org/wiki/Graph_database
First, databases are not "just" about persistence, and I don't mean that in a pedantic way. The point of a database management system is to manage your data over time and allow you to make use of it. This includes things like stating constraints to ensure that the data in the database is always correct and providing methods for querying your data at varying levels of complexity.
Second, there are a number of in-memory databases. Oracle sells one. SQLite is often used as one. The fact that they are in-memory does not make them not databases.
Third, relational versus non-relational is not really an argument about "how the data is stored on disk" or about being "standards compliant." A database traditionally consists primarily of three things: a data model, a query language, and an implementation. A data model, like "everything is a table with column headers and rows" can be written to disk in many different ways and queried in many different ways. (It's an abstraction.) Indexing structures and data formats may be mostly the same even when the data model is different. The (current) "relational vs non-relational" debate, such as it is, is usually about whether it makes sense to change the data model and query language in order to ensure that the implementation is scalable on "cloud-like" platforms with specific data access needs. (That said, there are many different types of non-relational models and query languages designed for different purposes.)
Fourth, there isn't one SQL. There are in fact a number of standards, SQL-86, SQL-89, SQL-92, and so on. It makes sense to learn the standard first (at least up to 92 or so), rather than learning particular implementation specifics. Most databases implement SQL-92 or above, though there is somewhat wide variance in additional features like indexing structures and functions and in data modeling languages.
...is like saying "the only thing you really need to know about sex is that it is for procreation." :-)
Sorry - couldn't resist. Utilizing databases effectively is often an area that is overlooked or kind of an afterthought. In my experience, often issues of scale stem from poor database design.
I put this in a separate comment, but any of Tom Kyte's books are excellent for gaining an understanding of how to use databases effectively. Although they are Oracle focused, there is still a lot to be learned about practical database design and utilization from his books. He also has a great site called “Ask Tom” where he answers questions. I have learned a ton from simply reading questions and his responses. Again, all Oracle focused, but very valuable nonetheless.
Ask Tom Link: http://asktom.oracle.com/pls/apex/f?p=100:1:0
For books, I've heard good things about http://www.amazon.com/Fundamentals-Database-Systems-Ramez-El..., though I haven't read it myself.
I trust you'll have no shortage of recommendations about CAP, column-based stores, and support for schema flexibility in other comments.
There are three main free software SQL databases: MySQL, PostgreSQL, and SQLite. Don't use MySQL, it'll get you into a great deal of bad habits, is missing an ass-ton of useful features (the default engine doesn't even support transactions), and likes to corrupt data. It also has questionable licensing and is owned by Oracle (a commercial database vendor) which as you can imagine doesn't have a lot of incentive to improve it.
SQLite is a very robust database that is great for read-mostly sites (like blogs) and for embedding into applications (Firefox for example). Its primary limitation is that there can only be one writer at a time. It is very easy to pick up and get going with since there's no persistent daemon, you just point it at a file and start writing. I'd recommend starting here.
PostgreSQL is the more featureful database and supports hundreds of concurrent connections, locking, access control, etc. This is the real "workhorse" database you're likely to find in production at large sites that do make use of free-software RDBMS. Eventually you will need to use something more powerful than SQLite, and this is probably it.
Also, you'll need to learn SQL. Check out http://www.w3schools.com/sql/default.asp for a good primer on the subject.
Pick any SQL flavor to start, sqlite is probably the easiest.
http://www.amazon.com/Database-Design-Mere-Mortals-Hands/dp/...
If you have access to a Windows machine my two cents says get a copy of SQL Server Express. In addition to that you will want a copy of the AdventureWorks database and the SQL Server Books Online.
SQL Server Express is a solid and free (as in beer) database engine. There are processor and file size limitations, but other than that it is pretty much the same as every other version they sell.
The Books Online (available online or as an installable download) has tutorial style information on every level of the system; database engine, reporting services etc.
The AdventureWorks database is a sample database that is used in the BOL and just about every other SQL Server tutorial I have seen. It is quite extensive and will save you from having to create your own data.
I'm sure somebody here will try and pick this apart, but this is how I learned. If it means anything, I have been employed as a Databse Analyst for the past four years.
They also offer "SQL Server 2008 R2 Update for Developers Training Kit", but I have not tried it. http://www.microsoft.com/downloads/details.aspx?familyid=FFF...
SQL Server Express: http://www.microsoft.com/sqlserver/2008/en/us/R2Downloads.as...
BOL: http://www.microsoft.com/downloads/details.aspx?displaylang=...
Adventure Works site: http://sqlserversamples.codeplex.com/
If you do work with Oracle, Tom's "Ask Tom" site is excellent for learning tips and tricks, as well as for finding elegant solutions to tricky SQL problems. Again, it is oracle focused, but still very valuable. Link to Ask Tom:
2) read up on some database theory
3) check out a few popular options: sqlite, mongodb, couchdb and play around a bit more.
4) find a big dataset that interests you, load it into one or more of the databases you want to experiment with, and experiment away by trying to extract interesting data, etc. Notice how each one performs, etc.
http://www.amazon.com/Beginning-Database-Design-Novice-Profe...