(1) What Is Pandas In Python?

  • Pandas is an open-source, Python-based library for high-performance data manipulation and analysis.
  • The name is derived from “panel data”, a term for multidimensional data sets.
  • It was developed in 2008 by Wes McKinney for data analysis.
  • Pandas is useful in the five major steps of data analysis: load the data, clean/manipulate it, prepare it, model it, and analyze it.
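
As a rough illustration of those steps, here is a minimal sketch. The CSV text, the column names, and the pass mark of 70 are all made up for the example, and the modelling step is left out because it normally relies on another library.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for a real data file.
raw_csv = io.StringIO(
    "name,marks\n"
    "Asha,67\n"
    "Ben,\n"
    "Chen,84\n"
)

# Load the data.
df = pd.read_csv(raw_csv)

# Clean it: drop the row whose mark is missing.
clean = df.dropna(subset=["marks"])

# Prepare it: derive a new column (the threshold 70 is an assumption).
clean = clean.assign(passed=clean["marks"] >= 70)

# Analyze it: compute a summary statistic.
mean_marks = clean["marks"].mean()
print(mean_marks)  # 75.5
```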

(2) Define Pandas DataFrame?

  • A data frame is a 2D, mutable, tabular structure for representing data with labelled axes (rows and columns).
  • The syntax for creating a data frame:
import pandas as pd
dataframe = pd.DataFrame(data=None, index=None, columns=None, dtype=None, copy=None)
  • data – The input data, which can take various forms such as a Series, map, ndarray, list, or dict.
  • index – Optional argument for row labels.
  • columns – Optional argument for column labels.
  • dtype – Optional data type to force on the columns. Only a single dtype is allowed, and it is inferred from the data when omitted.
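
The arguments above can be tried out with a small sketch like the following. The row labels "r1" to "r3" and the second, all-numeric frame are illustrative choices, not part of the syntax itself.

```python
import pandas as pd

# Build a DataFrame from a dict, with explicit row and column labels.
df = pd.DataFrame(
    data={"names": ["Subrat", "Arpita", "Abhispa"], "marks": [67, 75, 84]},
    index=["r1", "r2", "r3"],      # hypothetical row labels
    columns=["names", "marks"],
)
print(df.shape)                # (3, 2)
print(df.loc["r2", "marks"])   # 75

# dtype forces one data type on every column, so it suits all-numeric data.
nums = pd.DataFrame(data=[[1, 2], [3, 4]], columns=["a", "b"], dtype="float64")
print(nums.dtypes)
```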

(3) Mention The Different Types Of Data Structures In Pandas?

Pandas has three different types of data structures. These simple and flexible data structures are what make it fast and efficient.

  1. Series – A one-dimensional array-like structure with homogeneous data, which means all values in a Series share a single data type. That type can be integers, floats, strings, and so on. Its values are mutable (they can be changed), but the size of a Series is immutable (it cannot be changed).
  2. DataFrame – It is a two-dimensional array-like structure with heterogeneous data. It can contain data of different data types and the data is aligned in a tabular manner. Both size and values of DataFrame are mutable.
  3. Panel – A third data structure, a 3D container capable of storing heterogeneous data. It was never widely used, and it was deprecated in pandas 0.20 and removed in 0.25.
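
The difference between the first two structures can be seen in a short sketch. The sample values and the "grade" column are assumptions added for illustration.

```python
import pandas as pd

# Series: one dimension, one dtype, values can be changed in place.
s = pd.Series([1, 2, 3])
s.loc[0] = 10
print(s.tolist())  # [10, 2, 3]

# DataFrame: two dimensions, each column may have its own dtype.
df = pd.DataFrame({"names": ["Subrat", "Arpita"], "marks": [67, 75]})

# Size is mutable too: adding a column changes the shape.
df["grade"] = ["C", "B"]   # hypothetical extra column
print(df.shape)  # (2, 3)
print(df.dtypes)
```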

(4) What Is The Series Data Type In Pandas?

  • A Pandas Series is a one-dimensional array.
  • It holds any data type supported in Python and uses labels to locate each data value for retrieval.
  • These labels form the index, and they can be strings or integers.

Syntax:

s = pd.Series(data, index=index, dtype=None)
  • data = The data can be a Python dictionary, an ndarray, a list, or a scalar value like 5.
  • index = The list of axis labels. Defaults to RangeIndex (0, 1, 2, …, n-1) if not provided.
  • dtype = str, numpy.dtype, or ExtensionDtype, optional. Data type for the output Series. If not specified, it is inferred from data.

Example-1:

import pandas as pd 

data = [1, 2, 3, 4]

ser = pd.Series(data)

print(ser)

Output:

0    1
1    2
2    3
3    4
dtype: int64

Example-2:

import pandas as pd 

data = ['a','e','i','o','u']

ser = pd.Series(data)

for i in ser:
    print(i)

Output:

a
e
i
o
u
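
The two examples above rely on the default integer index. As a further sketch, the snippet below passes string labels, a dictionary, and a scalar as data. The label names are chosen only for illustration.

```python
import pandas as pd

# Explicit string labels form the index and are used for lookup.
marks = pd.Series([67, 75, 84], index=["Subrat", "Arpita", "Abhispa"])
print(marks["Arpita"])  # 75

# A dictionary supplies both the labels (keys) and the values.
from_dict = pd.Series({"a": 1, "b": 2})
print(list(from_dict.index))  # ['a', 'b']

# A scalar is repeated once for every label in the index.
scalar = pd.Series(5, index=["x", "y", "z"])
print(scalar.tolist())  # [5, 5, 5]
```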

(5) What Is The Panel Data Type In Pandas?

  • In Pandas, Panel was a container for three-dimensional data. It was deprecated in pandas 0.20 and removed in 0.25, so the code below only runs on older versions.
  • A Panel can hold ‘n’ data frames.
  • That is why it is a 3-dimensional data container.

Syntax:

s = pd.Panel(data=None, items=None, major_axis=None, minor_axis=None, copy=False, dtype=None)
  • items: Each item in this axis corresponds to one data frame, and this is called axis 0.

  • major_axis: This axis actually contains the rows or indexes of each of the data frames, and this is called axis 1.

  • minor_axis: This axis actually contains all the columns of each of the data frames, and this is called axis 2.

Example-1:

import pandas as pd

df1 = pd.DataFrame({'names': ['Subrat', 'Arpita', 'Abhispa', 'Subhada', 'Sonali'], 
                    'marks': [67, 75, 84, 90, 99]})

data = {'item1':df1, 'item2':df1}

panel = pd.Panel(data)

panel['item1']

Output:

	names	marks
0	Subrat	67
1	Arpita	75
2	Abhispa	84
3	Subhada	90
4	Sonali	99
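
On pandas releases that no longer include Panel, one common substitute is a DataFrame with a MultiIndex. The sketch below rebuilds the same two items that way. This is one possible approach rather than a drop-in replacement, and the level names "item" and "row" are assumptions.

```python
import pandas as pd

df1 = pd.DataFrame({'names': ['Subrat', 'Arpita', 'Abhispa', 'Subhada', 'Sonali'],
                    'marks': [67, 75, 84, 90, 99]})

data = {'item1': df1, 'item2': df1}

# Stack the frames; the dict keys become the outer level of the row index.
stacked = pd.concat(data, names=['item', 'row'])

# Selecting an outer label returns that item's rows, much like panel['item1'].
item1 = stacked.loc['item1']
print(item1)
```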
