Thursday, February 15, 2024

CHAPTER 14 Building a Machine Learning Forecasting Model in Python: A Step-by-Step Guide

Here's a basic framework for building a forecasting model using machine learning in Python:

  1. Data Collection and Preparation:

    • Collect relevant data for your forecasting task. This could be historical data of the metric you want to forecast.
    • Preprocess the data, handling missing values, outliers, and encoding categorical variables if necessary.
    • Split the data into training and testing sets. For time series, split chronologically so that the test set comes after the training set.
  2. Feature Engineering:

    • Extract relevant features from the data that can help improve the forecasting accuracy.
    • Features could include lagged values, rolling statistics, seasonality indicators, etc.
  3. Model Selection:

    • Choose appropriate machine learning algorithms for your forecasting task. Common choices include:
      • Linear Regression
      • Decision Trees
      • Random Forests
      • Gradient Boosting Machines
      • Long Short-Term Memory (LSTM) Networks (for time series forecasting)
    • You may also consider ensemble methods or stacking multiple models for better performance.
  4. Model Training:

    • Train your selected models using the training data.
    • Tune hyperparameters using techniques like cross-validation or grid search to optimize model performance.
  5. Model Evaluation:

    • Evaluate the trained models using appropriate metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), etc.
    • Compare the performance of different models to select the best one.
  6. Model Deployment:

    • Deploy the selected model for forecasting new data.
    • Monitor and update the model as needed.
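The steps above can be sketched end to end on a small synthetic series. This is a minimal illustration, not a production pipeline: the series, the choice of lag features, and the 12-row test window are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Synthetic series (assumption: stands in for real historical data)
rng = np.random.default_rng(0)
n = 60
series = pd.Series(100 + 2.0 * np.arange(n) + rng.normal(0, 5, n), name="y")

# Feature engineering: lagged values and a rolling mean of past values only
frame = pd.DataFrame({"y": series})
frame["lag_1"] = frame["y"].shift(1)
frame["lag_2"] = frame["y"].shift(2)
frame["roll_mean_3"] = frame["y"].shift(1).rolling(3).mean()
frame = frame.dropna()

# Chronological split: the last 12 rows form the test set
train, test = frame.iloc[:-12], frame.iloc[-12:]
features = ["lag_1", "lag_2", "roll_mean_3"]

# Training and evaluation
model = LinearRegression().fit(train[features], train["y"])
pred = model.predict(test[features])

mae = mean_absolute_error(test["y"], pred)
rmse = np.sqrt(mean_squared_error(test["y"], pred))
print(f"MAE: {mae:.2f}  RMSE: {rmse:.2f}")
```

The rolling mean is computed on the shifted series so that each row only uses values from earlier time steps, which avoids leaking the target into its own features.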
MODEL WITH RANDOM SALES DATA:

Let's create a simple forecasting model that predicts the sales price of an instrument for the next 30 years from randomly generated sales prices for the last 10 years. We'll use linear regression for simplicity; because the inputs are random, the resulting trend is purely illustrative.
Sample Code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Generating random sales price values for the last 10 years
np.random.seed(0)
years = np.arange(2014, 2024)
sales_price = np.random.randint(500, 1500, size=10)

# Creating a DataFrame for the historical sales data
data = pd.DataFrame({'Year': years, 'Sales_Price': sales_price})

# Plotting the historical sales data
plt.figure(figsize=(10, 6))
plt.plot(data['Year'], data['Sales_Price'], marker='o', linestyle='-')
plt.title('Historical Sales Price of Instrument')
plt.xlabel('Year')
plt.ylabel('Sales Price')
plt.grid(True)
plt.show()

# Creating features and target variable
X = data[['Year']]
y = data['Sales_Price']

# Creating a linear regression model
model = LinearRegression()

# Fitting the model
model.fit(X, y)

# Predicting sales price for the next 30 years
future_years = np.arange(2024, 2054).reshape(-1, 1)
future_sales_price = model.predict(future_years)

# Creating DataFrame for future predictions
future_data = pd.DataFrame({'Year': future_years.flatten(), 'Sales_Price': future_sales_price})

# Plotting historical and predicted sales data
plt.figure(figsize=(10, 6))
plt.plot(data['Year'], data['Sales_Price'], marker='o', linestyle='-', label='Historical Data')
plt.plot(future_data['Year'], future_data['Sales_Price'], marker='o', linestyle='--', color='red', label='Predicted Data')
plt.title('Sales Price of Instrument (Historical and Predicted)')
plt.xlabel('Year')
plt.ylabel('Sales Price')
plt.legend()
plt.grid(True)
plt.show()


Explanation:

Let's break down the code step by step:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

These lines import the necessary libraries: numpy for numerical operations, pandas for data manipulation, matplotlib.pyplot for plotting, and LinearRegression from sklearn.linear_model for fitting a linear regression model.

np.random.seed(0)
years = np.arange(2014, 2024)
sales_price = np.random.randint(500, 1500, size=10)

Here, we set a random seed for reproducibility using np.random.seed(0). We then generate an array years containing the years 2014 to 2023, and generate 10 random integer sales prices from 500 to 1499 using np.random.randint() (the upper bound, 1500, is exclusive).

data = pd.DataFrame({'Year': years, 'Sales_Price': sales_price})

We create a DataFrame data using pd.DataFrame(), storing the years and the corresponding sales prices generated in the previous step.

plt.figure(figsize=(10, 6))
plt.plot(data['Year'], data['Sales_Price'], marker='o', linestyle='-')
plt.title('Historical Sales Price of Instrument')
plt.xlabel('Year')
plt.ylabel('Sales Price')
plt.grid(True)
plt.show()

This section plots the historical sales data. We create a figure with a size of 10x6 inches using plt.figure(figsize=(10, 6)). We then plot the sales price against the year using plt.plot(). The marker='o' and linestyle='-' arguments specify that markers should be drawn and connected by solid lines. We set the title and the x and y axis labels, enable the grid, and display the plot using plt.show().

X = data[['Year']]
y = data['Sales_Price']

Here, we create the feature matrix X and the target variable y. X contains the 'Year' column from the DataFrame, while y contains the 'Sales_Price' column.

model = LinearRegression()

We instantiate a linear regression model using LinearRegression().

model.fit(X, y)

We fit the linear regression model to the data using the fit() method. This step trains the model on the historical data.

future_years = np.arange(2024, 2054).reshape(-1, 1)
future_sales_price = model.predict(future_years)

We generate the future years 2024 to 2053 using np.arange() and reshape them into a column vector using reshape(-1, 1). Then we use the trained model to predict the sales prices for these years using the predict() method.

future_data = pd.DataFrame({'Year': future_years.flatten(), 'Sales_Price': future_sales_price})

We create a DataFrame future_data containing the future years and the predicted sales prices.

plt.figure(figsize=(10, 6))
plt.plot(data['Year'], data['Sales_Price'], marker='o', linestyle='-', label='Historical Data')
plt.plot(future_data['Year'], future_data['Sales_Price'], marker='o', linestyle='--', color='red', label='Predicted Data')
plt.title('Sales Price of Instrument (Historical and Predicted)')
plt.xlabel('Year')
plt.ylabel('Sales Price')
plt.legend()
plt.grid(True)
plt.show()

Finally, we plot both the historical and the predicted sales prices on the same graph. We create a new figure, plot the historical data, plot the predicted data, set the title, labels, and legend, enable the grid, and display the plot.
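To see what the fitted model has actually learned, it can help to inspect the slope, the intercept, and the R² score. The sketch below regenerates the same seeded data; the variable names slope, intercept, r2, and forecast_2053 are introduced here for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same seeded synthetic data as in the listing above
np.random.seed(0)
years = np.arange(2014, 2024)
sales_price = np.random.randint(500, 1500, size=10)

X = years.reshape(-1, 1)
model = LinearRegression().fit(X, sales_price)

slope = model.coef_[0]            # fitted change in price per year
intercept = model.intercept_      # fitted value at year 0 (not meaningful on its own)
r2 = model.score(X, sales_price)  # R^2 measured on the training data

print(f"slope = {slope:.2f} per year, R^2 = {r2:.3f}")

# The forecast is simply the fitted line extended to a later year
forecast_2053 = slope * 2053 + intercept
print(f"forecast for 2053 = {forecast_2053:.2f}")
```

With purely random inputs the R² is typically low, which is a reminder that a 30-year extrapolation from such data should not be read as a real forecast.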


OUTPUT:

MODEL WITH USER DEFINED SALES DATA:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Prompting the user to input sales prices for the last 10 years
sales_data = []
for year in range(2014, 2024):
    sales_price = float(input(f"Enter sales price for year {year}: "))
    sales_data.append((year, sales_price))

# Creating a DataFrame for the historical sales data
data = pd.DataFrame(sales_data, columns=['Year', 'Sales_Price'])

# Plotting the historical sales data
plt.figure(figsize=(10, 6))
plt.plot(data['Year'], data['Sales_Price'], marker='o', linestyle='-')
plt.title('Historical Sales Price of Instrument')
plt.xlabel('Year')
plt.ylabel('Sales Price')
plt.grid(True)
plt.show()

# Creating features and target variable
X = data[['Year']]
y = data['Sales_Price']

# Creating a linear regression model
model = LinearRegression()

# Fitting the model
model.fit(X, y)

# Predicting sales price for the next 30 years
future_years = np.arange(2024, 2054).reshape(-1, 1)
future_sales_price = model.predict(future_years)

# Creating DataFrame for future predictions
future_data = pd.DataFrame({'Year': future_years.flatten(), 'Sales_Price': future_sales_price})

# Plotting historical and predicted sales data
plt.figure(figsize=(10, 6))
plt.plot(data['Year'], data['Sales_Price'], marker='o', linestyle='-', label='Historical Data')
plt.plot(future_data['Year'], future_data['Sales_Price'], marker='o', linestyle='--', color='red', label='Predicted Data')
plt.title('Sales Price of Instrument (Historical and Predicted)')
plt.xlabel('Year')
plt.ylabel('Sales Price')
plt.legend()
plt.grid(True)
plt.show()

OUTPUT:

Enter sales price for year 2014: 450
Enter sales price for year 2015: 470
Enter sales price for year 2016: 478
Enter sales price for year 2017: 500
Enter sales price for year 2018: 560
Enter sales price for year 2019: 580
Enter sales price for year 2020: 590
Enter sales price for year 2021: 600
Enter sales price for year 2022: 603
Enter sales price for year 2023: 605
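
The same fit can be run without interactive prompts, which makes it easier to rerun and test. The following is a minimal sketch using the sample values entered above; forecast_linear is a hypothetical helper introduced here, not part of scikit-learn.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def forecast_linear(years, prices, horizon):
    """Fit a straight line to (years, prices) and extend it by `horizon` years."""
    years = np.asarray(years, dtype=float)
    prices = np.asarray(prices, dtype=float)
    if years.shape != prices.shape or years.size < 2:
        raise ValueError("need at least two (year, price) pairs of equal length")
    model = LinearRegression().fit(years.reshape(-1, 1), prices)
    # Future years start right after the last observed year
    future = np.arange(years.max() + 1, years.max() + 1 + horizon)
    return future.astype(int), model.predict(future.reshape(-1, 1))

# Sample inputs taken from the session above
years = list(range(2014, 2024))
prices = [450, 470, 478, 500, 560, 580, 590, 600, 603, 605]

future_years, future_prices = forecast_linear(years, prices, horizon=30)
print(future_years[0], round(float(future_prices[0]), 1))
print(future_years[-1], round(float(future_prices[-1]), 1))
```

Because the sample prices rise every year, the fitted slope is positive and the forecast keeps increasing for the whole 30-year horizon; a straight line has no way to represent the flattening visible in the last few inputs.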






Sunday, February 11, 2024

CHAPTER 13 AI BASED VEHICLE (NUMBER PLATE DATA) RECOGNITION SYSTEM

In the era of rapid technological advancement, artificial intelligence (AI) has emerged as a transformative force across various domains, revolutionizing conventional processes and enhancing efficiency. One such area witnessing remarkable innovation is the field of transportation, where AI-powered solutions are reshaping the landscape of vehicle recognition systems. Among these, AI-based vehicle number plate recognition systems stand out as a pioneering application with multifaceted benefits and implications.

Vehicle number plate recognition, also known as Automatic Number Plate Recognition (ANPR) or License Plate Recognition (LPR), refers to the automated detection and interpretation of vehicle license plates through the utilization of AI algorithms and computer vision techniques. By harnessing the power of machine learning, neural networks, and image processing, these systems can accurately extract alphanumeric characters from license plates, decode them, and subsequently analyze the acquired data for diverse purposes.

The significance of AI-based vehicle number plate recognition systems transcends mere identification; it extends to critical functionalities across various sectors. In law enforcement, such systems play a pivotal role in enhancing public safety and security by enabling swift and accurate identification of vehicles involved in criminal activities, traffic violations, or Amber Alerts. Moreover, in toll collection and parking management, ANPR systems streamline operations, facilitate seamless transactions, and mitigate revenue leakages by automating the process of fee collection and vehicle tracking.

Furthermore, the integration of AI-driven insights from vehicle number plate data holds immense potential in facilitating urban planning, traffic management, and transportation analytics. By analyzing patterns in vehicle movements, congestion hotspots, and commuting behaviors, city authorities can optimize infrastructure planning, improve traffic flow, and mitigate environmental impacts.

However, the deployment of AI-based vehicle number plate recognition systems also raises pertinent considerations regarding privacy, data security, and ethical usage. Striking a balance between the benefits of enhanced surveillance capabilities and the protection of individual liberties remains a critical challenge in the adoption and regulation of such technologies.

In this context, this paper aims to explore the intricacies of AI-based vehicle number plate recognition systems comprehensively. By delving into the underlying technologies, applications, challenges, and ethical considerations, we seek to elucidate the transformative potential of these systems while addressing the imperative of responsible deployment and governance in a rapidly evolving technological landscape.


Code:

import cv2
import csv
from datetime import datetime
import pytesseract

# Load pre-trained Haarcascades for license plate detection
plate_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_russian_plate_number.xml')

# Initialize video capture from the inbuilt camera (0 for default camera)
cap = cv2.VideoCapture(0)

# Create and open a CSV file to store number plate data along with date/time
csv_file = open('number_plate_data.csv', 'w', newline='')
csv_writer = csv.writer(csv_file)
csv_writer.writerow(['Date', 'Number Plate'])

while True:
    # Read frame from camera
    ret, frame = cap.read()
    if not ret:
        print("Failed to capture image")
        break

    # Convert frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect license plates in the grayscale frame
    plates = plate_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)

    # Process detected license plates
    for (x, y, w, h) in plates:
        # Extract the number plate region from the frame
        plate_roi = gray[y:y+h, x:x+w]

        # Perform OCR (Optical Character Recognition) on the plate region
        plate_text = pytesseract.image_to_string(plate_roi, config='--psm 8')  # Adjust psm value based on your image

        # Get the current date and time
        current_date = datetime.now().strftime('%Y-%m-%d %H:%M:%S')

        # Save the number plate data along with the date/time to the CSV file
        csv_writer.writerow([current_date, plate_text.strip()])

        # Draw rectangle around the detected license plate
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

    # Display the frame with detected license plates
    cv2.imshow('License Plate Detection', frame)

    # Check for key press, if 'q' is pressed, exit the loop
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release video capture, close CSV file, and close all OpenCV windows
cap.release()
csv_file.close()
cv2.destroyAllWindows()



This code utilizes Tesseract OCR to extract the text from the detected number plate region. The extracted text is then written to the CSV file along with the date/time stamp. Make sure to adjust the Tesseract configuration (config) parameters based on your specific requirements and the quality of the input images.
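
Raw OCR output often contains spaces, punctuation, or stray symbols, and the same plate is usually read on many consecutive frames. A minimal sketch of a clean-up step is shown below. The helper names clean_plate_text and new_readings, and the length limits of 4 to 10 characters, are assumptions for illustration; real plate formats vary by country.

```python
import re

def clean_plate_text(raw, min_len=4, max_len=10):
    """Keep only A-Z and 0-9 from OCR output; return None if the length is implausible."""
    text = re.sub(r"[^A-Z0-9]", "", raw.upper())
    return text if min_len <= len(text) <= max_len else None

def new_readings(readings):
    """Yield cleaned plates, skipping rejected reads and consecutive repeats."""
    previous = None
    for raw in readings:
        plate = clean_plate_text(raw)
        if plate is not None and plate != previous:
            previous = plate
            yield plate

# Example OCR strings (made up for this sketch)
samples = ["ka 01 ab-1234\n", "KA01AB1234", "|_", "MH12 DE 1433 "]
print(list(new_readings(samples)))  # ['KA01AB1234', 'MH12DE1433']
```

In the capture loop, such a filter would sit between the call to pytesseract.image_to_string() and the CSV write, so that only plausible, non-repeated readings are stored.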

The OCR (Optical Character Recognition) step relies on the Tesseract OCR engine together with the pytesseract library. Make sure Tesseract itself is installed on your system; the pytesseract wrapper can then be installed using pip:

pip install pytesseract

Output:

The CSV file named number_plate_data.csv is created in the current working directory, that is, the directory from which you run the Python script. This is the same as the script's own directory only if you launch the script from there.

After running the script, navigate to that directory and you should find the number_plate_data.csv file. You can open it with any text editor or with spreadsheet software such as Microsoft Excel to view its contents.
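
If the output location matters, the path can be passed explicitly, and a with block makes sure the file is closed even when the capture loop exits with an error. The sketch below is one possible arrangement; append_plate and read_plates are hypothetical helpers, and the temporary directory is used only so that the example cleans up after itself.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path

def append_plate(csv_path, plate_text, when=None):
    """Append one (timestamp, plate) row, writing the header on first use."""
    csv_path = Path(csv_path)
    is_new = not csv_path.exists()
    when = when or datetime.now()
    with csv_path.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["Date", "Number Plate"])
        writer.writerow([when.strftime("%Y-%m-%d %H:%M:%S"), plate_text])

def read_plates(csv_path):
    """Return all stored rows as a list of dictionaries keyed by column name."""
    with Path(csv_path).open(newline="") as f:
        return list(csv.DictReader(f))

# Demonstration in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "number_plate_data.csv"
    append_plate(path, "KA01AB1234", datetime(2024, 2, 11, 10, 30, 0))
    append_plate(path, "MH12DE1433", datetime(2024, 2, 11, 10, 31, 0))
    rows = read_plates(path)

print(rows)
```

Opening the file in append mode also keeps earlier readings between runs, whereas the 'w' mode used in the listing above starts a fresh file each time.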




Saturday, February 10, 2024

CHAPTER 12 AI BASED WASHING MACHINE CONTROL

Designing a machine learning model to fully control a washing machine involves several steps, including data collection, preprocessing, model training, and deployment. Here's a simplified example using scikit-learn to demonstrate the process:
Data Collection: 
Collect data from sensors such as temperature sensors, water level sensors, and load sensors. You'll need labeled data indicating the optimal settings for each type of fabric and level of dirtiness.

Preprocessing: 
Prepare the data for training by normalizing features, handling missing values, and splitting it into training and testing sets.

Model Training: Train a machine learning model, such as a neural network, to predict the optimal settings based on input features like fabric type, dirtiness level, and previous washing history.

Model Evaluation: Evaluate the model's performance on the testing dataset to ensure it generalizes well to unseen data.

Deployment: Deploy the trained model to the washing machine's control system, allowing it to make real-time decisions based on sensor inputs.

Whole Program:

import numpy as np
from sklearn.linear_model import LinearRegression

# Get user input for fabric type and dirtiness level
fabric_type = int(input("Enter fabric type (1 for cotton, 2 for silk, 3 for wool): "))
dirtiness_level = int(input("Enter dirtiness level (1 for low, 2 for medium, 3 for high): "))

# Preprocess user input
# Map fabric type to one-hot encoding
fabric_type_encoding = np.zeros(3)
fabric_type_encoding[fabric_type - 1] = 1

# Normalize dirtiness level
dirtiness_level_normalized = (dirtiness_level - 1) / 2.0  # Scale to range [0, 1]

# Combine features
user_input_features = np.concatenate([fabric_type_encoding, [dirtiness_level_normalized]])

# Repeat the input to match the number of samples in the target values
user_input_features_repeated = np.tile(user_input_features, (3, 1))

# Pre-trained model weights (you need to have pre-trained weights saved)
# Assuming we have pre-trained weights as a list of arrays
pretrained_weights = np.array([
    [0.1, 0.2, 0.3, 0.4],  # Water Level
    [0.2, 0.3, 0.4, 0.5],  # Temperature
    [0.5, 0.6, 0.7, 0.8]   # Duration
])

# Define a simple linear regression model
model = LinearRegression()

# Fit the model to the data
model.fit(user_input_features_repeated, pretrained_weights)

# Predict optimal settings for user input
predicted_settings = model.predict(user_input_features.reshape(1, -1))

print("Predicted Settings:")
print("Water Level:", predicted_settings[0][0])
print("Temperature:", predicted_settings[0][1])
print("Duration:", predicted_settings[0][2])


Output

Enter fabric type (1 for cotton, 2 for silk, 3 for wool): 3
Enter dirtiness level (1 for low, 2 for medium, 3 for high): 1
Predicted Settings:
Water Level: 0.26666666666666666
Temperature: 0.3666666666666667
Duration: 0.4666666666666666

Let's go through each line of the code and explain its purpose:

import numpy as np
from sklearn.linear_model import LinearRegression
  • import numpy as np: Imports the NumPy library and assigns it the alias np. NumPy is used for numerical operations and array manipulation.
  • from sklearn.linear_model import LinearRegression: Imports the LinearRegression class from scikit-learn's linear_model module. This class is used to fit linear regression models.

# Get user input for fabric type and dirtiness level
fabric_type = int(input("Enter fabric type (1 for cotton, 2 for silk, 3 for wool): "))
dirtiness_level = int(input("Enter dirtiness level (1 for low, 2 for medium, 3 for high): "))


  • input("Enter fabric type (1 for cotton, 2 for silk, 3 for wool): "): Prompts the user to enter the fabric type and reads the input as a string.
  • int(...): Converts the string input to an integer.
  • fabric_type = ...: Assigns the integer value entered by the user to the variable fabric_type.
  • Similarly, dirtiness_level = ... reads and assigns the user input for the dirtiness level.

# Preprocess user input
# Map fabric type to one-hot encoding
fabric_type_encoding = np.zeros(3)
fabric_type_encoding[fabric_type - 1] = 1


  • np.zeros(3): Creates a NumPy array of zeros with length 3. This array will be used to represent the one-hot encoding for fabric type.
  • fabric_type_encoding[fabric_type - 1] = 1: Sets the element corresponding to the fabric type entered by the user to 1, indicating the presence of that fabric type in the one-hot encoding.

# Normalize dirtiness level
dirtiness_level_normalized = (dirtiness_level - 1) / 2.0  # Scale to range [0, 1]



(dirtiness_level - 1) / 2.0: Normalizes the dirtiness level entered by the user to a value between 0 and 1. This is done by subtracting 1 from the entered level (to make it 0-indexed) and then dividing by 2.

# Combine features
user_input_features = np.concatenate([fabric_type_encoding, [dirtiness_level_normalized]])

np.concatenate([...]): Combines the one-hot encoded fabric type and the normalized dirtiness level into a single array. This array represents the features used for prediction.


# Repeat the input to match the number of samples in the target values
user_input_features_repeated = np.tile(user_input_features, (3, 1))

np.tile(user_input_features, (3, 1)): Repeats the single input features three times to match the number of samples in the target values. This ensures that both input features and target values have the same number of samples.

# Define a simple linear regression model
model = LinearRegression()

model = LinearRegression(): Creates an instance of the LinearRegression class, which represents the linear regression model.

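# Fit the model to the data
model.fit(user_input_features_repeated, pretrained_weights)

model.fit(...): Fits the linear regression model. The user_input_features_repeated array supplies the input features, and pretrained_weights is used as the target values (three samples with four outputs each). Because the three input rows are identical, the fit cannot learn any dependence on the input: the model simply predicts the column-wise mean of pretrained_weights, which is why the output above shows 0.2667, 0.3667, and 0.4667. A real controller would be trained on many distinct labeled samples.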

# Predict optimal settings for user input
predicted_settings = model.predict(user_input_features.reshape(1, -1))


model.predict(...): Uses the trained model to make predictions on the input features. The reshaping is done to ensure compatibility with the predict method, which expects a 2D array.


print("Predicted Settings:")
print("Water Level:", predicted_settings[0][0])
print("Temperature:", predicted_settings[0][1])
print("Duration:", predicted_settings[0][2])

Prints the predicted settings for the user input. Each line prints one predicted setting (water level, temperature, duration) along with its corresponding value.
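
For comparison, here is a minimal sketch of the same encoding trained on a small labeled dataset, so that the prediction actually depends on the input. The label values in labeled are invented for this example and are not measured washing-machine settings; encode is a hypothetical helper that wraps the preprocessing steps described above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def encode(fabric_type, dirtiness_level):
    """One-hot fabric type (1..3) plus dirtiness level (1..3) scaled to [0, 1]."""
    if fabric_type not in (1, 2, 3) or dirtiness_level not in (1, 2, 3):
        raise ValueError("fabric_type and dirtiness_level must be 1, 2 or 3")
    one_hot = np.zeros(3)
    one_hot[fabric_type - 1] = 1
    return np.concatenate([one_hot, [(dirtiness_level - 1) / 2.0]])

# HYPOTHETICAL labels: (fabric, dirtiness) -> (water level, temperature, duration),
# each scaled to the range 0..1
labeled = {
    (1, 1): (0.4, 0.5, 0.3), (1, 2): (0.5, 0.6, 0.5), (1, 3): (0.6, 0.7, 0.7),
    (2, 1): (0.2, 0.2, 0.2), (2, 2): (0.3, 0.3, 0.3), (2, 3): (0.4, 0.4, 0.4),
    (3, 1): (0.3, 0.3, 0.3), (3, 2): (0.4, 0.4, 0.4), (3, 3): (0.5, 0.5, 0.6),
}
X = np.array([encode(f, d) for f, d in labeled])  # shape (9, 4)
Y = np.array(list(labeled.values()))              # shape (9, 3)

# One multi-output linear model for all three settings
model = LinearRegression().fit(X, Y)

low = model.predict(encode(3, 1).reshape(1, -1))[0]
high = model.predict(encode(3, 3).reshape(1, -1))[0]
print("wool, low dirtiness :", np.round(low, 2))
print("wool, high dirtiness:", np.round(high, 2))
```

Here each row of X is a distinct sample and each column of Y is one setting, so the two predictions differ, unlike the program above, where every input yields the same output.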

CHAPTER 18 EXPLORING THERMODYNAMICS WITH PYTHON: UNDERSTANDING CARNOT'S THEOREM AND MORE

  Python is a versatile programming language that can be used to simulate and analyze various physical phenomena, including thermal physics ...