
1.0.0 home page

Description:

Welcome to the Data Science Playbook

Hello Data Science World! This project is a collection of code snippets for doing data science. This site is dedicated to using machine learning to train models that train humans to learn about machine learning.

Why the project is useful:
Data science is better when you can grab code snippets from a recipe book. It makes every single data science byte extra tasty.

How users can get started with the project:
Feel free to grab snippets of code and use them. Just remember that this code is available on an "AS IS" basis, without warranties or conditions of any kind, either express or implied.

Where users can get help with the project:
If you have ideas for improving this repository, send me an email at DataSciencePlaybook@gmail.com

Who maintains and contributes to the project:
This code is available on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

Example:

Topic Page
1-Home 1.0.0 home page
Import Data 2.0.0 Import from csv
Import Data 2.0.1 Import from tab delimited text file
Import Data 2.0.2 Import from file with custom delimiter
Import Data 2.0.3 Import from Excel file
Import Data 2.0.4 Import from pickle
Import Data 2.0.5 Import from Google Sheet
Import Data 2.0.6 Import from SQL database
Import Data 2.0.7 Import limited set of columns from text file
Import Data 2.0.8 Specify data types of imported data from text file
Import Data 2.0.9 Specify data types of imported data from Excel
Import Data 2.0.10 Import limited set of columns from Excel
Export Data 3.0.0 Export to csv
Export Data 3.0.1 Export to tab delimited text file
Export Data 3.0.2 Export to file with custom delimiter
Export Data 3.0.3 Export to Excel
Export Data 3.0.4 Export to pickle
Export Data 3.0.5 Export a dataframe to a database table
Export Data 3.0.6 Specify which column names you want to be exported for flat files
Data Connections 6.0.0 Construct a filepath
Data Connections 6.0.1 Add a Date Stamp for a filepath
Data Connections 6.0.2 Create a SQLite Connection
Python to Dataframe 1.0.0 Create Empty Dataframe
Python to Dataframe 1.0.1 Create Empty Dataframe with column names
Python to Dataframe 1.0.2 Create dataframe and add columns from lists
Python to Dataframe 1.0.3 convert a column to a list
Python to Dataframe 1.0.4 Export two columns to a python dictionary
Databases 5.0.0 Import a Password from a file stored outside your code
Databases 5.0.1 SQLite Connection
Databases 5.0.2 Oracle Connection
Databases 5.0.3 SQL Server Connection Trusted Connection
Databases 5.0.4 Query a Database with SQL and create a DataFrame
Table Definitions 7.0.0 Get column names
Table Definitions 7.0.1 Get column datatypes
Table Definitions 7.0.2 Total rows and columns
Table Definitions 7.0.3 Total rows - including nulls
Table Definitions 7.0.4 Total columns
Table Definitions 7.0.5 Get column names, counts of values, datatypes, and memory usage
Table Definitions 7.0.6 Count non-null instances in column
Table Definitions 7.0.7 get number of rows that are NULL
Table Definitions 7.0.8 Get summary of null values for all columns
Table Definitions 7.0.9 Count unique or distinct values in a column
Table Definitions 7.0.10 Get percent of times each value occurs
Table Definitions 7.0.11 count duplicate values and get sum of each value in column
Table Definitions 7.0.12 Count of values where value equals x in column
Table Definitions 7.0.13 Count number of duplicate rows
Table Modifications 8.0.0 Limit DataFrame to only specific columns
Table Modifications 8.0.1 Set the order of the columns
Table Modifications 8.0.2 Drop specific columns
Table Modifications 8.0.3 Change column names
Table Modifications 8.0.4 Change column data type to float
Table Modifications 8.0.5 Change column data type to int
Table Modifications 8.0.6 Change column data type to string
Table Modifications 8.0.7 Change column data type to datetime
Table Modifications 8.0.8 Reference - Date Format Symbols
Data Formatting 9.0.0 Extract Year from Date
Data Formatting 9.0.1 Extract Month Number from Date
Data Formatting 9.0.2 Extract Day of the Month from Date
Data Formatting 9.0.3 Extract Day of the Week Number from date
Data Formatting 9.0.4 Extract Day of the Week Name from Date
Data Formatting 9.0.5 Extract Time from Date
Data Formatting 9.0.6 Extract Hour from Date
Data Formatting 9.0.7 Extract Minute from date
Data Formatting 9.0.8 Filtering with Dates
Data Formatting 9.0.9 difference between two dates
Data Formatting 9.0.10 Create a Lagged Feature
Data Formatting 9.0.11 deal with null values
Data Formatting 9.0.12 Apply a function to process text
Data Formatting 9.0.13 count total characters
Data Formatting 9.0.14 count total words
Data Formatting 9.0.15 count occurrences of specific word
Data Formatting 9.0.16 capitalize first letter in sentence
Data Formatting 9.0.17 capitalize first letter in each word in sentence
Data Formatting 9.0.18 convert to upper case
Data Formatting 9.0.19 convert to lower case
Data Formatting 9.0.20 Remove punctuation from text
Data Formatting 9.0.21 strip front and back spaces
Data Formatting 9.0.22 stem words in a string
Data Formatting 9.0.23 return nth word in a string
Data Formatting 9.0.24 Return Nth sentence in a string
Data Formatting 9.0.25 Return Substring between two words
Conditional Logic 10.0.0 Query a DataFrame with SQL
Conditional Logic 10.0.1 Where Column Equals Value
Conditional Logic 10.0.2 Where Column Does NOT Equal Value
Conditional Logic 10.0.3 Where Column IN List
Conditional Logic 10.0.4 Where Column NOT IN List
Conditional Logic 10.0.5 Where Column is Null
Conditional Logic 10.0.6 Where Column is Not Null
Conditional Logic 10.0.7 Where multiple conditions are all true - AND Logic
Conditional Logic 10.0.8 Where one or more condition is true - OR Logic
Conditional Logic 10.0.9 CASE WHEN logic - Option 1 - pandas .loc
Conditional Logic 10.0.10 CASE WHEN logic - Option 2 - np.where
Conditional Logic 10.0.11 CASE WHEN logic - Option 3 - create and apply a custom function
Conditional Logic 10.0.12 Create a rank column
Combine Group and Sort 11.0.0 JOIN or MERGE two dataframes
Combine Group and Sort 11.0.1 STACK and UNION DataFrames on top of each other
Combine Group and Sort 11.0.2 Append a new row to a DataFrame with a Dictionary
Combine Group and Sort 11.0.3 Append a new row to a DataFrame with a pd.Series
Combine Group and Sort 11.0.4 Use SQL to perform a Group By operation
Combine Group and Sort 11.0.5 Sort Ascending and descending
Combine Group and Sort 11.0.6 Get TOP x Rows
Combine Group and Sort 11.0.7 Get BOTTOM X Rows
Combine Group and Sort 11.0.8 Get random sample of X rows
Combine Group and Sort 11.0.9 filter to rows in a list of index values, such as a range
Combine Group and Sort 11.0.10 filter using index position
Combine Group and Sort 11.0.11 Create and iteratively fill an empty DataFrame
SQL DF nan.0 Format a SQL string with a parameter
Descriptive Stats 12.0.0 Generate summary statistics
Descriptive Stats 12.0.1 Sum
Descriptive Stats 12.0.2 Count non-null values
Descriptive Stats 12.0.3 average or mean
Descriptive Stats 12.0.4 median
Descriptive Stats 12.0.5 mode
Descriptive Stats 12.0.6 Standard Deviation
Descriptive Stats 12.0.7 Min
Descriptive Stats 12.0.8 Max
Descriptive Stats 12.0.9 Quantiles
Descriptive Stats 12.0.10 Two Standard Deviations from the mean
Descriptive Stats 12.0.11 Z-Score
Data Visualization 13.0.0 Bar Chart
Data Visualization 13.0.1 Histogram
Data Visualization 13.0.2 Cool Charts
Data Exploration for ML nan.0 Data Exploration for ML
Preprocessing 1.0.0 Use a Scikit-Learn Transformer
Preprocessing 1.0.1 Replace Null with most frequent value
Preprocessing 1.0.2 Create Dummy Variables with the OneHotEncoder
Pipelines 1.0.0 Full Pipeline example
Pipelines 1.0.1 Create a pipeline containing sub-pipelines
Pipelines 1.0.2 Fit a pipeline on a dataset
Pipelines 1.0.3 Use a fitted pipeline to transform a new dataset
Pipelines 1.0.4 Fit a pipeline and transform a dataset
Pipelines 1.0.5 Get feature names generated by a pipeline
Pipelines 1.0.6 Convert processed new data from numpy array to a DataFrame with names
Models nan.0 Full ML Example
useful snippets 14.0.0 Import custom libraries to run in your python program
useful snippets 14.0.1 Create a bat file to run a python file
useful snippets 14.0.2 prevent a bat file from closing after the file is done running until you press a key
useful snippets 14.0.3 Copy a file to a new location
useful snippets 14.0.4 Get the current date and time as a string
useful snippets 14.0.5 Log when an action happens
useful snippets 14.0.6 Run another python program from within your python program
useful snippets 14.0.7 Calculate the amount of memory available
useful snippets 14.0.8 Delete a dataframe from memory
useful snippets 14.0.9 Calculate how much time it takes to do something
useful snippets 14.0.10 Use SQL to see how many rows are in a table
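
To give a feel for the kind of recipe the index points to, here is a minimal sketch that combines three of the listed topics: adding a date stamp to a filepath (6.0.1), creating a SQLite connection (6.0.2), and using SQL to count the rows in a table (14.0.10). This is not the playbook's own code. The helper names `date_stamped_path` and `count_rows`, the table name `demo`, the `output` folder, and the in-memory database are all assumptions made for illustration, and only the Python standard library is used.

```python
import sqlite3
from datetime import datetime
from pathlib import Path


def date_stamped_path(folder, stem, suffix=".csv", when=None):
    """Build a path like folder/stem_YYYY-MM-DD.csv (hypothetical helper)."""
    when = when or datetime.now()
    return Path(folder) / f"{stem}_{when:%Y-%m-%d}{suffix}"


def count_rows(conn, table):
    """Return the number of rows in `table` using plain SQL (hypothetical helper)."""
    # Table names cannot be bound as query parameters, so pass only trusted names.
    cursor = conn.execute(f'SELECT COUNT(*) FROM "{table}"')
    return cursor.fetchone()[0]


# ":memory:" creates a throwaway database; a real project would pass a file path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany("INSERT INTO demo (label) VALUES (?)", [("a",), ("b",), ("c",)])
conn.commit()

row_count = count_rows(conn, "demo")
export_path = date_stamped_path("output", "demo", when=datetime(2024, 1, 31))

print(row_count)               # 3
print(export_path.as_posix())  # output/demo_2024-01-31.csv
```

Passing an explicit `when` keeps the example reproducible; leaving it out stamps the path with today's date.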