Download
[Update:March 2008]
The MyTransaction library is now available to download under the MIT Open Source License.
- Download: transactions.zip version 0.1.1
- Original [0.1.0] 19.03.08: Uploaded.
- Change [0.1.1] 19.03.08: Added one little class I forgot... oops!
- View the documentation online
- Full running example
If you use the library, please let me know how you get on...
Background
Everyone who's ever built a MySQL database app has made a choice about which database engine to use. You may not realise it of course but you have.
In short the difference is; if you need ACID transactions, foreign key-constraints and row-level locking use InnoDB, otherwise use MyISAMs.
There are also a couple of more subtle differences which can bite you later on like COUNTS. SELECT COUNT(*) FROM big_table; will return in a milisecond from a MyISAM but the same query requires real work from the InnoDB engine.
Like many people I started with MyISAM; it's fast, it uses less disk, memory and effort. I never missed the foreign key constraints because Spirit's aesthetics engine can deal with that for us and I've not yet found we really need row-level locking. The one thing though that would be really handy to have in MyISAM would be transactions.
What're transactions and why would you want them?
A transaction is simply a way of telling your database "All or nothing at all". It's about database integrity.
Take for example the following sequence of commands, following Johnny paying Billy £5:
- Take £5 from Jonny's account
- Add £5 to Billy's account
- Record a £5 transfer in the transfer ledger
At the end of the week you can look at Jonny's starting balance, his current balance and all the items in the ledger which mention him and you know that all is right in the world.
Now it doesn't matter what order you do this in, if the process stops half way through your database is going to be a mess, your totals at the end of the week will be all wrong and you won't have a clue why.
Transactions are part of the answer to the problem because although you issue your commands as normal, with a transactional database your changes are only pencilled into the database and only right at the end, when you're sure you happy with everything do you tell the database to commit it to pen.
Any other reason for transactions?
Well - in our case yes. Testing!
Many of the sites we run walk the user through a long sequence of web-forms. At the end of the form the system has collected a lot of data together and has to comit it to the database. It may be committing a dozens of records to dozens of different tables so the queries can get pretty complex.
When you're testing a new process the last thing you want to do is to click through the whole process filling out forms every time you make a little change. And when you get to the end you don't want the system to be doing something different the second time based on the data that went in the last time.
Wouldn't it be lovely if, when in test mode, your website could do everything, 100% as normal... but if anything goes wrong, or even if everything goes right, it would undo everything it's done and leave you back exactly where you started so you could test it again!?
Transaction & Rollback for MyISAM
And so I set about writing a transaction & rollback mechanism for MyISAM... and it works.
If your DB class is abstracted properly (and is written in PHP) you should be able to add in this functionality with just a few lines of code.
Features:
- Commit & Rollback of statements in MyISAM
- Full support for single-table INSERT, DELETE, UPDATE
- Support for multiple-table UPDATE
- Support for concurrent-transactions
- Support for concurrency of transaction & non-transaction statements
- Automatic Rollback on_exit_handler()
Features waiting to be written!
- Support for table aliases
- Support for multiple-table DELETES
(should you even be doing that though?) - Support for tables with no PRIMARY KEY
Caveat Emptor
- This is not an excuse to ignore InnoDB
- This is not for production environments, it's a testing tool
How does it work?
It's really not very exciting or clever and it works like this:
First you start a transaction, for which you get a numeric ID.
Then rather than executing your statements yourself you pass your statement to the transaction object which takes care of executing it for you.
Just before it executes your statement it records, within the database, the state of what you're about to change.
If you want to rollback your transaction the transaction library simply executes another sequence of statements on the database to undo the changes you've just made and revert the database back to it's original state.
If you're happy with your changes and wish to 'commit' them to the database then the transaction library simply does nothing.
Finally, by default, if your process completes and no commit() has been received the transaction library will automatically rollback all transactions.
A few last notes and FAQs
What happens on a fatal error.
If your process dies completely and the on_exit_handler isn't called then you're stuck back where you were. Also if you disconnect from the database you're equally scuppered. (See above: Caveat Emptor)
Doesn't that make it pointless?
Of course not. It's an invaluable testing tool. It allows you to re-make the same request again and again in the knowledge that, when you're done, the database will be back exactly the way it was.
How fast is it?
I haven't benched it (yet) but it should be pretty fast depending on the amount of data you're changing. It stores the data in temporary tables and has little or no requirement for indexes etc.
Why don't you just re-image the database each time?
Because this is easier, because other people may be working in other parts of the database. Because don't want to keep re-imaging every step.
Any things it doesn't do?
Yes. No support for aliases such as UPDATE foo AS t1, bar as t2 SET t1.xxx = 1 WHERE etc. as yet ( partly becuase I don't tend to use them )
Also on the list to add is management of LOCK TABLE WRITE which at the moment it doesn't bother with.
Can I have it?
Yes, by all means. I need to upload it somewhere really, if there's any interest. For now if you want a look at the source just nudge me (jim) and I'll email it over.
Oh - and silly me, I should mention.. it's PHP (4) although it's not very long or complicated so it could easily be ported elsewhere.
A few suggestions...
I just checked it out, and it works awesome so far!
I noticed/did a couple of things:
Your idea is really good and you also worked it out very good!
Thanks
Paul
Great!
I've stumbled upon this, and just want to cheer you up. Seems great for implementing a undo system for a webapp.
The only thing that stops me for trying it, it's the alias thingy. All my queries use alias, so it would be a little pain in the ass to change them.
How difficult would be to implement it?
Great Indeed
Sounds great, MyISAM is my choice for most things but I often end up using a library that provides fake transaction support. This one looks like it's worth looking into!
@Paul: Yeah, I hear what you're saying about the auto-increment keys not being rolled back. I think that you're right and that the expected behaviour in native (InnoDB or Berkely DB) transactional support is to NOT roll back auto-incrementing keys in event of a booboo.
I guess you could design something that would automatically roll the keys back, but I don't see the point. I guess if you needed monthly stats for some special project you could just run some kind of re-keying function over a copy it that one time. Probably be safer anyway, particularly on a big table..........
Oemer
Crash
I want to test it now. But my question is how this library handles crashes. For instance i create a transaction with 100 queries and in the middle of the transaction the connection (online) crashes. How to rollback the transaction ?
Michael
Perfect
Thank you very much for making this available, I found it very useful!
:)
Jim
@Michael
No problem. The library needs another release really - it's got a couple of problems;
- It uses RegEx heavily which means to handle big queries you need to up your php.ini limits for the Perl Regex engine...
- It doesn't queries that use the word "where" in a text/char value because my RegEx is a bit lazy...
... but anyhow, glad you find it useful!!
Michael
One More Thing
The regex will barf if a ',' is contained within a query in an "unexpected" place, such as within the new value of a column being updated.
Workaround is to html encode all ',' in any values being sent to the db.
Otherwise it works really well! I'm impressed and grateful :)
Yves
my little contribution
Hi, thanks for this code, it works well. I had to change a few lines for requests with back quotes around column names.
ex:
in stored_update#parse
if ( preg_match ("/^s*UPDATEs*`?(.*?)`?s*SETs*(.*?)s*((WHERE|LIMIT|ORDER|$).*)/si", $sQuery, $mx ) ) { ... }
note the "`?" parts
and further in the same function :
preg_match("/`?(w+)`?s*=/",$setItem,$mx);
probably not the best solution but it works