Python Pandas Ditches NumPy for Speedier PyArrow

by blacktulip on 6/4/25, 6:30 AM with 4 comments
by joshlk on 6/4/25, 9:14 AM

> [numpy] stores everything in rows

This isn't true. Pandas uses NumPy to store columns of data. There are quite a few technical errors in the article.
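A quick way to check the column-wise claim, as a minimal sketch: it assumes a small made-up DataFrame with the default NumPy-backed integer and float dtypes (the column names `a` and `b` are only illustrative).

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame; each column has its own dtype.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

# Pull out each column as its underlying array.
col_a = df["a"].to_numpy()
col_b = df["b"].to_numpy()

# Each column is a one-dimensional NumPy array with a single dtype,
# which is what you would expect from column-oriented storage.
print(type(col_a).__name__, col_a.ndim, col_a.dtype)
print(type(col_b).__name__, col_b.ndim, col_b.dtype)
```

Each column comes back as its own 1-D `ndarray` with a single dtype, which would not be the natural layout if the data were stored row by row.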

by constantcrying on 6/4/25, 8:17 AM

This is an insane article. I do not think the author has any idea what is going on.

The comparison of NumPy reading CSV to Arrow reading Parquet is completely bizarre and totally misses the point of switching out the underlying data format.

by agons on 6/4/25, 6:09 PM

It gets worse the further you go. This was where I had to bail:

> the format is much favored by AI frameworks such as TensorFlow and PyCharm.