Pandas

Integrating with Pandas: Extending Your Data Science Toolkit

Pixeltable and Pandas are complementary tools, not competitors. Two differences are worth knowing up front. First, Pixeltable dataframes do not hold data and cannot be used to update it (use insert/update/delete for that purpose). Second, unlike in Pandas, query execution must be initiated explicitly before any results are returned. Here's how to use the two together.

Import Pandas Data into Pixeltable

# %pip -q install pixeltable
import pixeltable as pxt
import pandas as pd

# Create a Pandas DataFrame (load from file, etc.)
df = pd.read_csv("my_data.csv")

# Create a Pixeltable table from the DataFrame 
table = pxt.io.import_pandas("my_table", df)
  • import_pandas: This function streamlines the creation of a Pixeltable table from a Pandas DataFrame, inferring the column types automatically.
  • Data Types: Pixeltable's rich type system handles different data types (int, float, string, etc.) from your Pandas DataFrame.
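Before importing, it can help to check the dtypes Pandas assigned and to make the column names predictable. The comparison later on this page refers to the Pandas column "0.250" as c_0_250, which suggests that names are normalized on import; the exact rule is not stated here. The sketch below renames columns up front instead. `to_identifier` is a hypothetical helper (not part of Pixeltable or Pandas), and the sample values are made up.

```python
import re

import pandas as pd


def to_identifier(name: str) -> str:
    """Hypothetical helper: turn a column label into a valid Python identifier."""
    # Replace every character that is not a letter, digit, or underscore.
    cleaned = re.sub(r"\W", "_", str(name))
    # Identifiers cannot be empty or start with a digit, so add a prefix.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "c_" + cleaned
    return cleaned


# Small stand-in for the DataFrame loaded from my_data.csv.
df = pd.DataFrame({"0.250": [0.1, 0.2], "0.229": [0.05, 0.1], "label": ["a", "b"]})

# rename() accepts a function and applies it to every column label.
df = df.rename(columns=to_identifier)

# Inspect the dtypes that type inference will start from.
print(df.dtypes)
```

The renamed DataFrame can then be passed to import_pandas exactly as shown above.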

Query a Pixeltable and Get a Pandas DataFrame

filtered_df = table.where(table.some_column > 5).df()  # Get a filtered subset of data as a DataFrame
  • df() Method: This converts your Pixeltable query results into a Pandas DataFrame.
  • No full copy: Only the rows and columns selected by the query are retrieved, so filtering in Pixeltable first usually avoids loading the entire dataset into memory.
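Once the result is a plain Pandas DataFrame, ordinary Pandas operations apply to it. The following is a minimal sketch with a hand-built DataFrame standing in for filtered_df, so that it runs without a Pixeltable table. The "category" column and all values are assumptions made for illustration.

```python
import pandas as pd

# Stand-in for the DataFrame returned by the query above (some_column > 5).
# The "category" column and the values are hypothetical.
filtered_df = pd.DataFrame(
    {
        "some_column": [6, 8, 10, 7],
        "category": ["a", "b", "a", "b"],
    }
)

# Per-group row count and mean of the filtered values.
summary = filtered_df.groupby("category")["some_column"].agg(["count", "mean"])

# The two rows with the largest values.
top = filtered_df.sort_values("some_column", ascending=False).head(2)

print(summary)
print(top)
```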

Compare Pixeltable and Pandas for Data Manipulation

You can use either Pandas or Pixeltable for interactive data manipulation and analysis. Working with a Pixeltable table additionally gives you its lineage tracking, versioning, and integration with other ML tools.

Compute a new feature using Pandas:

df["test"] = df["0.250"] - df["0.229"]
df["test"].head(5)

Extract calculated data using Pixeltable:

table.select(table.c_0_250 - table.c_0_229).head(5)
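The Pandas side of this comparison can be tried on a small in-memory DataFrame. In the sketch below the column names follow the snippet above and the values are made up. It separates evaluating the expression, which leaves df unchanged and is loosely comparable to selecting a computed expression, from assigning it, which stores a new column in memory.

```python
import pandas as pd

# Sample data; the column names mirror the comparison above, the values are made up.
df = pd.DataFrame({"0.250": [0.5, 1.0, 1.5], "0.229": [0.25, 0.5, 0.5]})

# Evaluating the expression returns a new Series and does not modify df.
diff = df["0.250"] - df["0.229"]
columns_before = list(df.columns)

# Assigning the Series adds the feature as a new in-memory column.
df["test"] = diff

print(columns_before)
print(df["test"].head(5))
```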

Learn more about how data manipulation differs between Pandas and Pixeltable here, and see the API Reference.

Key Takeaways

Pixeltable seamlessly integrates with Pandas, enabling you to leverage both tools for a comprehensive AI development workflow. You can easily import and export Pandas DataFrames, use familiar Pandas operations for analysis, and then benefit from Pixeltable's powerful features like lineage tracking, version control, and deployment capabilities.

As a next step, you might check out one of the other tutorials, depending on your interests: