pandas Python Library
A pandas Python Library is a Python data transformation library and a Python data analysis library.
- Context:
- It can (typically) support a pandas Data Structure, such as pandas.DataFrame and pandas.Series.
- Example(s):
- Counter-Example(s):
- See: PyData, Tabular Data, OLAP Aggregation, OLAP Drill Down, Moving Window Function, Rolling Regression.
References
2017
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Pandas_(software) Retrieved:2017-6-5.
- pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. Pandas is free software released under the three-clause BSD license. The name is derived from the term “panel data”, an econometrics term for multidimensional structured data sets.
2013a
- http://pandas.pydata.org/
- pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
2013b
- http://pandas.pydata.org/pandas-docs/stable/
- pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way toward this goal.
pandas is well suited for many different kinds of data:
- Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet.
- Ordered and unordered (not necessarily fixed-frequency) time series data.
- Arbitrary matrix data (homogeneously typed or heterogeneous) with row and column labels.
- Any other form of observational / statistical data sets. The data actually need not be labeled at all to be placed into a pandas data structure.
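The kinds of data listed above can be sketched in a few lines. This is a minimal illustration, not taken from the pandas documentation; all column names, labels, and values are hypothetical sample data.

```python
import numpy as np
import pandas as pd

# Tabular data with heterogeneously typed columns (hypothetical sample values).
table = pd.DataFrame(
    {
        "name": ["a", "b", "c"],
        "count": [1, 2, 3],
        "price": [0.5, 1.25, 2.0],
    }
)

# Time series with irregular (not fixed-frequency) timestamps.
stamps = pd.to_datetime(["2017-01-01", "2017-01-04", "2017-01-05"])
series = pd.Series([10.0, 12.0, 11.0], index=stamps)

# Matrix data with row and column labels.
matrix = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["r1", "r2"],
    columns=["x", "y", "z"],
)

# Unlabeled data: pandas supplies a default integer index.
unlabeled = pd.Series([3, 1, 2])
```

Each object keeps its labels through later operations, so `matrix.loc["r2", "y"]` selects a cell by row and column label rather than by position.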
- The two primary data structures of pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides and much more. pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other 3rd party libraries.
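As a rough sketch of how the two structures relate to each other and to NumPy, assuming only the standard `pandas` and `numpy` imports (the labels and values below are made up for illustration):

```python
import numpy as np
import pandas as pd

# A Series is a 1-dimensional labeled array.
s = pd.Series([1.0, 4.0, 9.0], index=["a", "b", "c"])

# A DataFrame is a 2-dimensional table; its columns share one row index.
df = pd.DataFrame({"x": s, "y": s * 2})

# Selecting a single column of a DataFrame yields a Series.
col = df["x"]

# The underlying values can be handed to NumPy as a plain ndarray.
arr = df.to_numpy()

# NumPy ufuncs applied to a Series return a Series with the same labels.
root = np.sqrt(s)
```

Here `arr` has shape `(3, 2)` and carries no labels, while `root` is still a Series indexed by `a`, `b`, `c`.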
2012
- (McKinney, 2012) ⇒ Wes McKinney. (2012). “Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython.” O'Reilly Media. ISBN:9781449323615