Almost every organization depends on large databases. These are essentially collections of tables and connected through columns. These database systems support SQL, the Structured Query Language, which is used to create, access, and manipulate the data. SQL is used to access data, and also to create and exploit the relationships between the stored data. Additionally, these databases support database normalization rules for avoiding the redundancy of data.
Database Programming in Python
The Python programming language has powerful features for database programming. Python supports various databases like MySQL, Oracle, Sybase, PostgreSQL, etc. Python also supports Data Definition Language (DDL), Data Manipulation Language (DML), and Data Query Statements. For database programming, the Python DB API is a widely used module that provides a database application programming interface.
DB-API (SQL-API) for Python
Python provides DB-API which is independent of any database engine and it enables you to write Python scripts to access any database engine. The Python DB-API implementation for different databases are as follows –
- For MySQL it is MySQLdb.
- For PostgreSQL it is psycopg, PyGresQL and pyPgSQL
- For Oracle it has dc_oracle2 and cx_oracle.
- For DB2 DB-API implementation is Pydb2.
Python’s DB-API consists of connection objects, cursor objects, standard exceptions, and some other module contents.
Connection objects in DB-API of Python create a connection with the database which is further used for different transactions. These connection objects are also used as representatives of the database session.
A connection to a database in Python is created as follows:
conn = MySQLdb.connect('library', user='user_name', password='python')
After connection to a database has been established, you can use it for calling objects like commit(), rollback(), and close().
Commit: In a general sense, a commit is the updating of a record in a database. In the context of a database transaction, a commit refers to the saving of data permanently after a set of tentative changes. A commit ends a transaction within a relational database and allows all other users to see the changes.
Rollback: In database technologies, a rollback is an operation that returns the database to some previous state. Rollbacks are important for database integrity because they mean that the database can be restored to a clean copy even after erroneous operations are performed.
cur = conn.cursor() //creates new cursor object for executing SQL statements conn.commit() //Commits the transactions conn.rollback() //Roll back the transactions conn.close() //closes the connection conn.callproc(proc,param) //call stored procedure for execution conn.getsource(proc) //fetches stored procedure code
The cursor is one of the powerful features of SQL. These are objects that are responsible for submitting various SQL statements to a database server. There are several cursor classes in MySQLdb.cursors:
- BaseCursor is the base class for Cursor objects.
- Cursor is the default cursor class. CursorWarningMixIn, CursorStoreResultMixIn, CursorTupleRowsMixIn, and BaseCursor are some components of the cursor class.
- CursorStoreResultMixIn uses the mysql_store_result() function to retrieve result sets from the executed query. These result sets are stored at the client side.
- CursorUseResultMixIn uses the mysql_use_result() function to retrieve result sets from the executed query. These result sets are stored at the server side.
The following example illustrates the execution of SQL commands using cursor objects. You can use execute to execute SQL commands like SELECT. To commit all SQL operations you need to close the cursor as cursor.close().
cursor.execute('SELECT * FROM books') cursor.execute('''SELECT * FROM books WHERE book_name = 'python' AND book_author = 'Mark Lutz') cursor.close()
These objects represent a database cursor, which is used to manage the context of a fetch operation. Cursors created from the same connection are not isolated, i.e., any changes done to the database by a cursor are immediately visible by the other cursors. Cursors created from different connections can or can not be isolated, depending on how the transaction support is implemented.
Cursor Objects should respond to the following methods and attributes.
.description: This read-only attribute is a sequence of 7-item sequences. Each of these sequences contains information describing one result column:
The first two items (name and type_code) are mandatory, the other five are optional and are set to None if no meaningful values can be provided. This attribute will be None for operations that do not return rows or if the cursor has not had an operation invoked via the .execute*() method yet.
.rowcount: This read-only attribute specifies the number of rows that the last .execute*() produced (for DQL statements like SELECT) or affected (for DML statements like UPDATE or INSERT). The attribute is -1 in case no .execute*() has been performed on the cursor or the rowcount of the last operation cannot be determined by the interface.
This method is optional since not all databases provide stored procedures.
.callproc(procname [, parameters ]): It calls a stored database procedure with the given name. The sequence of parameters must contain one entry for each argument that the procedure expects. The result of the call is returned as a modified copy of the input sequence. Input parameters are left untouched, output and input/output parameters replaced with possibly new values. The procedure may also provide a result set as output. This must then be made available through the standard .fetch*() methods.
.close(): Close the cursor now. The cursor will be unusable from this point forward; an Error (or subclass) exception will be raised if any operation is attempted with the cursor.
.execute(operation [, parameters]): Prepare and execute a database operation (query or command). Parameters may be provided as sequence or mapping and will be bound to variables in the operation. Variables are specified in a database-specific notation. A reference to the operation will be retained by the cursor. If the same operation object is passed in again, then the cursor can optimize its behavior. This is most effective for algorithms where the same operation is used, but different parameters are bound to it (many times).
For maximum efficiency when reusing an operation, it is best to use the .setinputsizes() method to specify the parameter types and sizes ahead of time. It is legal for a parameter to not match the predefined information; the implementation should compensate, possibly with a loss of efficiency.
The parameters may also be specified as a list of tuples to e.g. insert multiple rows in a single operation, but this kind of usage is deprecated: .executemany() should be used instead.
.executemany(operation, seq_of_parameters): Prepare a database operation (query or command) and then execute it against all parameter sequences or mappings found in the sequence seq_of_parameters. Modules are free to implement this method using multiple calls to the .execute() method or by using array operations to have the database process the sequence as a whole in one call.
You can use this method for an operation that produces one or more result sets that constitutes undefined behavior, and the implementation is permitted (but not required) to raise an exception when it detects that a result set has been created by an invocation of the operation.
.fetchone(): Fetch the next row of a query result set, returning a single sequence, or None when no more data is available. An Error (or subclass) exception is raised if the previous call to .execute*() did not produce any result set or no call was issued yet.
.fetchmany([size=cursor.arraysize]): Fetch the next set of rows of a query result, returning a sequence of sequences (e.g. a list of tuples). An empty sequence is returned when no more rows are available. The number of rows to fetch per call is specified by the parameter. If it is not given, the cursor’s arraysize determines the number of rows be fetched. The method should try to fetch as many rows as indicated by the size parameter. If this is not possible due to the specified number of rows not being available, fewer rows may be returned. An Error (or subclass) exception is raised if the previous call to .execute*() did not produce any result set or no call was issued yet.
.fetchall(): Fetch all (remaining) rows of a query result, returning them as a sequence of sequences (e.g. a list of tuples). Note that the cursor’s arraysize attribute can affect the performance of this operation. An Error (or subclass) exception is raised if the previous call to .execute*() did not produce any result set or no call was issued yet.
.nextset(): This method will make the cursor skip to the next available set, discarding any remaining rows from the current set. If there are no more sets, the method returns None. Otherwise, it returns a true value, and subsequent calls to the .fetch*() methods will return rows from the next result set. An Error (or subclass) exception is raised if the previous call to .execute*() did not produce any result set or no call was issued yet.
.arraysize: This read/write attribute specifies the number of rows to fetch at a time with .fetchmany(). It defaults to 1 meaning to fetch a single row at a time. Implementations must observe this value concerning the .fetchmany() method, but are free to interact with the database a single row at a time. It may also be used in the implementation of .executemany().
.setinputsizes(sizes): This can be used before a call to .execute*() to predefine memory areas for the operation’s parameters. sizes are specified as a sequence — one item for each input parameter. The item should be a Type Object that corresponds to the input that will be used, or it should be an integer specifying the maximum length of a string parameter. If the item is None, then no predefined memory area will be reserved for that column (this is useful to avoid predefined areas for large inputs). This method would be used before the .execute*() method is invoked.
.setoutputsize(size [, column]): Set a column buffer size for fetches of large columns (e.g. LONGs, BLOBs, etc.). The column is specified as an index into the result sequence. Not specifying the column will set the default size for all large columns in the cursor. This method would be used before the .execute*() method is invoked.
Python and MySQL
Python and MySQL are a good combination to develop database applications. Before proceeding, you make sure you have MySQLdb installed on your machine. Just type the following in your Python script and execute it.
The first step in using a database in a program is to establish a database connection.
import MySQLdb # Open database connection db = MySQLdb.connect("localhost","testuser","test123","TESTDB" ) # prepare a cursor object using cursor() method cursor = db.cursor() # execute SQL query using execute() method. cursor.execute("SELECT VERSION()") # Fetch a single row using fetchone() method. data = cursor.fetchone() print "Database version : %s " % data # disconnect from server db.close()
If a connection is established with the data source, then a Connection Object is returned and saved into db for further use, otherwise, db is set to None. Next, the db object is used to create a cursor object, which in turn is used to execute SQL queries. Finally, before coming out, it ensures that the database connection is closed and resources are released.
Creating a Database Table
Once a database connection is established, we are ready to create tables or records into the database using the execute method of the created cursor.
Following code creates a database table “Employee”:
import MySQLdb # Open database connection db = MySQLdb.connect("localhost","testuser","test123","TESTDB" ) # prepare a cursor object using cursor() method cursor = db.cursor() # Drop table if it already exist using execute() method. cursor.execute("DROP TABLE IF EXISTS EMPLOYEE") # Create table as per requirement sql = """CREATE TABLE EMPLOYEE ( FIRST_NAME CHAR(20) NOT NULL, LAST_NAME CHAR(20), AGE INT, SEX CHAR(1), INCOME FLOAT )""" cursor.execute(sql) # disconnect from server db.close()
It is required when you want to create your records into a database table. The following example, executes SQL INSERT statement to create a record into EMPLOYEE table:
import MySQLdb # Open database connection db = MySQLdb.connect("localhost","testuser","test123","TESTDB" ) # prepare a cursor object using cursor() method cursor = db.cursor() # Prepare SQL query to INSERT a record into the database. sql = """INSERT INTO EMPLOYEE(FIRST_NAME, LAST_NAME, AGE, SEX, INCOME) VALUES ('Mac', 'Mohan', 20, 'M', 2000)""" try: # Execute the SQL command cursor.execute(sql) # Commit your changes in the database db.commit() except: # Rollback in case there is any error db.rollback() # disconnect from server db.close()
READ operation on any database means to fetch some useful information from the database. Once your database connection is established, you are ready to make a query into this database. You can use either fetchone() method to fetch a single record or fetchall() method to fetch multiple values from a database table.
- fetchone(): It fetches the next row of a query result set. A result set is an object that is returned when a cursor object is used to query a table.
- fetchall(): It fetches all the rows in a result set. If some rows have already been extracted from the result set, then it retrieves the remaining rows from the result set.
- rowcount: This is a read-only attribute and returns the number of rows that were affected by an execute() method.
The following code queries all the records from EMPLOYEE table having salary more than 1000:
import MySQLdb # Open database connection db = MySQLdb.connect("localhost","testuser","test123","TESTDB" ) # prepare a cursor object using cursor() method cursor = db.cursor() sql = "SELECT * FROM EMPLOYEE \ WHERE INCOME > '%d'" % (1000) try: # Execute the SQL command cursor.execute(sql) # Fetch all the rows in a list of lists. results = cursor.fetchall() for row in results: fname = row lname = row age = row sex = row income = row # Now print fetched result print "fname=%s,lname=%s,age=%d,sex=%s,income=%d" % \ (fname, lname, age, sex, income ) except: print "Error: unable to fecth data" # disconnect from server db.close()
UPDATE operation on any database means to update one or more records, which are already available in the database. The following code updates all the records having SEX as ‘M’ by increasing AGE by one year.
import MySQLdb # Open database connection db = MySQLdb.connect("localhost","testuser","test123","TESTDB" ) # prepare a cursor object using cursor() method cursor = db.cursor() # Prepare SQL query to UPDATE required records sql = "UPDATE EMPLOYEE SET AGE = AGE + 1 WHERE SEX = '%c'" % ('M') try: # Execute the SQL command cursor.execute(sql) # Commit your changes in the database db.commit() except: # Rollback in case there is any error db.rollback() # disconnect from server db.close()
It is required when you want to delete some records from your database. Following is the code to delete all the records from EMPLOYEE where AGE is more than 20:
import MySQLdb # Open database connection db = MySQLdb.connect("localhost","testuser","test123","TESTDB" ) # prepare a cursor object using cursor() method cursor = db.cursor() # Prepare SQL query to DELETE required records sql = "DELETE FROM EMPLOYEE WHERE AGE > '%d'" % (20) try: # Execute the SQL command cursor.execute(sql) # Commit your changes in the database db.commit() except: # Rollback in case there is any error db.rollback() # disconnect from server db.close()
Commit is the operation, which gives a green signal to a database to finalize the changes, and after this operation, no change can be reverted. The following code is used to perform commit operation:
If you are not satisfied with one or more of the changes and you want to revert those changes completely, then you can use the rollback() method. The following code will perform the rollback operation.
Disconnecting the Database
To disconnect the Database connection, you can use the close() method:
If the connection to a database is closed by the user with the close() method, any outstanding transactions are rolled back by the database. However, instead of depending on any of the database lower-level implementation details, your application would be better off calling commit or rollback explicitly.