Python DB-API to numpy

Recently, I was looking for “best practices” for getting data from MySQL to a numpy array. Thanks to Tim Hochberg on the numpy mailing list, we get the following as the apparently fastest way to bridge the DB-numpy gap:

# CREATE TABLE demo (x REAL, y REAL)
c = conn.cursor()
c.execute('SELECT * FROM demo')
y = numpy.fromiter(c, dtype=[('a',float), ('b', float)])

At least, it was the fastest back in 2006.

3 Comments

  1. Thanks for this!

  2. Cursor iterators return pyodbc.Row objects, which numpy does not seem to know how to deal with but are essentially an extended tuplse. Therefore in order to use the fromiter method to make a numpy structured array from a mysql or other db query, you must change the iterator slightly. I did the following:

    The generator expression simply casts the Row object to a tuple.

    cursor = conn.cursor()
    cursor.execute('SELECT a, b FROM demo')

    result = np.fromiter((tuple (row) for row in cursor), dtype=[('a',float), ('b', float)])

  3. It's also a good idea to use the fromiter count parameter so it knows the size of array to create:
    result = np.fromiter((tuple (row) for row in cursor), dtype=[(‘a’,float), (‘b’, float)], count = cursor.rowcount)

Leave your thoughts
  • You can use some HTML in your comment.
  • Your comment may not display immediately due to spam filtering. Please wait for moderation.