转载自:http://www.reddit.com/r/Database/comments/27u6dy/how_do_you_build_a_database/ciggal8

Its a great question, and deserves a long answer.

Most database servers are built in C, and store data using B-tree type constructs. In the old days there was a product called C-Isam (c library for an indexed sequential access method) which is a low level library to help C programmers write data in B-tree format. So you need to know about btrees and understand what these are.

Most databases store data separate to indexes. Lets assume a record (or row) is 800 bytes long and you write 5 rows of data to a file. If the row contains columns such as first name, last name, address etc. and you want to search for a specific record by last name, you can open the file and sequentially search through each record but this is very slow. Instead you open an index file which just contains the lastname and the position of the record in the data file. Then when you have the position you open the data file, lseek to that position and read the data. Because index data is very small it is much quicker to search through index files. Also as the index files are stored in btrees in it very quick to effectively do a quicksearch (divide and conquer) to find the record you are looking for.

So you understand for one "table" you will have a data file with the data and one (or many) index files. The first index file could be for lastname, the next could be to search by SS number etc. When the user defines their query to get some data, they decide which index file to search through. If you can find any info on C-ISAM (there used to be an open source version (or cheap commercial) called D-ISAM) you will understand this concept quite well.

Once you have stored data and have index files, using an ISAM type approach allows you to GET a record based on a value, or PUT a new record. However modern database servers all support SQL, so you need an SQL parser that translates the SQL statement into a sequence of related GETs. SQL may join 2 tables so an optimizer is also needed to decide which table to read first (normally based on number of rows in each table and indexes available) and how to relate it to the next table. SQL can INSERT data so you need to parse that into PUT statements but it can also combine multiple INSERTS into transactions so you need a transaction manager to control this, and you will need transaction logs to store wip/completed transactions.

It is possible you will need some backup/restore commands to backup your data files and index files and maybe also your transaction log files, and if you really want to go for it you could write some replication tools to read your transaction log and replicate the transactions to a backup database on a different server. Note if you want your client programs (for example an SQL UI like phpmyadmin) to reside on separate machine than your database server you will need to write a connection manager that sends the SQL requests over TCP/IP to your server, then authenticate it using some credentials, parse the request, run your GETS and send back the data to the client.

So these database servers can be a lot of work, especially for one person. But you can create simple versions of these tools one at a time. Start with how to store data and indexes, and how to retrieve data using an ISAM type interface.

There are books out there - look for older books on mysql and msql, look for anything on google re btrees and isam, look for open source C libraries that already do isam. Get a good understanding on file IO on a linux machine using C. Many commercial databases now dont even use the filesystem for their data files because of cacheing issues - they write directly to raw disk. You want to just write to files initially.

I hope this helps a little bit.

廖雪峰的解释:转载自http://www.ruanyifeng.com/blog/2014/07/database_implementation.html

如何写一个数据库How do you build a database?(转载)的更多相关文章

  1. 用c++写一个数据库

    [cpp] view plain copy 第一步:构建一个头文件(**.h) [cpp] view plain copy #include<iostream> #include<i ...

  2. 跟我一起用Symfony写一个博客网站;

    我的微信公众号感兴趣的话可以扫一下, 或者加微信号   whenDreams 第一部分:基础设置,跑起一个页面-首页 第一步: composer create-project symfony/fram ...

  3. 如何写一个能在gulp build pipe中任意更改src内容的函数

    gulp在前端自动化构建中非常好用,有非常丰富的可以直接拿来使用的plugin,完成我们日常构建工作. 但是万事没有十全十美能够完全满足自己的需求,这时我们就要自己动手写一个小的函数,用于在gulp ...

  4. 用 Python 写一个 NoSQL 数据库Python

    NoSQL 这个词在近些年正变得随处可见. 但是到底 “NoSQL” 指的是什么? 它是如何并且为什么这么有用? 在本文, 我们将会通过纯 Python (我比较喜欢叫它, “轻结构化的伪代码”) 写 ...

  5. 用javaweb写一个注册界面,并将数据保存到后台数据库(全部完成)(课堂测试)

    一.题目:WEB界面链接数据库 1.考试要求: 1登录账号:要求由6到12位字母.数字.下划线组成,只有字母可以开头:(1分) 2登录密码:要求显示“• ”或“*”表示输入位数,密码要求八位以上字母. ...

  6. 【Part1】用JS写一个Blog(node + vue + mongoDB)

    学习JS也有一段时间了,准备试着写一个博客项目,前后端分离开发,后端用node只提供数据接口,前端用vue-cli脚手架搭建,路由也由前端控制,数据异步交互用vue的一个插件vue-resourse来 ...

  7. SpringBoot写一个登陆注册功能,和期间走的坑

    文章目录 前言 1. 首先介绍项目的相关技术和工具: 2. 首先创建项目 3. 项目的结构 3.1实体类: 3.2 Mapper.xml 3.3 mapper.inteface 3.4 Service ...

  8. 写一个TODO App学习Flutter本地存储工具Moor

    写一个TODO App学习Flutter本地存储工具Moor Flutter的数据库存储, 官方文档: https://flutter.dev/docs/cookbook/persistence/sq ...

  9. Spring Security 实战干货:从零手写一个验证码登录

    1. 前言 前面关于Spring Security写了两篇文章,一篇是介绍UsernamePasswordAuthenticationFilter,另一篇是介绍 AuthenticationManag ...

随机推荐

  1. windows下adb(android调试桥)基本命令(持续更新。。。)

    前言:刚开始学习android(坚持每天1篇笔记哈^_^),比较实用的命令是adb,所以就先学习这些,主要用真机调试,模拟器用的是genymotion,所以emulator暂时不大需要哈,可以后续再补 ...

  2. Xcode6和Xcode5获取app名字

    1.在Xcode5下,获取程序名字(app name)的方法为: NSDictionary *infoDictionary = [[NSBundle mainBundle] infoDictionar ...

  3. JAVA按字节读取文件

    JAVA的IO流一直都是我比较头疼的部分(我没有系统学过JAVA,一般需要实现什么功能再去看文档). 最近遇到一个需求:一个字节一个字节地读取一个文件.网上很多方法,代码一大堆.我在这里和大家分享一个 ...

  4. noip2015 提高组day1、day2

    NOIP201505神奇的幻方   试题描述 幻方是一种很神奇的N∗N矩阵:它由数字 1,2,3,……,N∗N构成,且每行.每列及两条对角线上的数字之和都相同.    当N为奇数时,我们可以通过以下方 ...

  5. (原+转)ubuntu14中结束多个caffe进程中的某个

    转载请注明出处: http://www.cnblogs.com/darkknightzh/p/5948237.html 参考网址: http://www.2cto.com/os/201407/3215 ...

  6. C++ 语言特性的性能分析

    转载:http://www.cnblogs.com/rollenholt/archive/2012/05/07/2487244.html      大多数开发人员通常都有这个观点,即汇编语言和 C 语 ...

  7. MySQL用户管理语句001

    总的来说mysql的用户管理方法可以分为如下两种: 1.直接对mysql.user 表进行[insert | update | delete] + flush privileges 这种方式主要针对那 ...

  8. WCF不支持 ASP.NET 兼容性 解决办法

    错 误提示:无法激活服务,因为它不支持 ASP.NET 兼容性.已为此应用程序启用了 ASP.NET 兼容性.请在 web.config 中关闭 ASP.NET 兼容性模式或将 AspNetCompa ...

  9. javascript之Function函数

    在javascript里,函数是可以嵌套的. 如: function(){ funcrion square(x){ return x*x;  } return square(10); } 在javas ...

  10. 《how to design programs》第11章自然数

    这章让我明白了原来自然数的定义本来就是个递归的过程. 我们通常用枚举的方式引出自然数的定义:0,1,2,3,等等(etc).最后的等等是什么意思?唯一能把等等从描述自然数的枚举方法中去除的方法是自引用 ...