09.15.24

Ripping an image-based PDF to text (and old TurboBASIC commands)

Recently, I wanted to pull the text out of a scanned PDF of an old manual, namely the Borland Turbo Basic manual. The text layer already in the document was garbage and there was no salvaging it, so I decided to write something that would allow me to:

1. Load a PDF
2. Render each page of the PDF to an image
3. OCR the images to generate the text

So, here it is.

import fitz  # PyMuPDF
from PIL import Image
import pytesseract
import io

# Set the path to the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

def pdf_to_text(pdf_path: str):
    # Open the PDF file
    pdf_document = fitz.open(pdf_path)
    text_output = []

    # Iterate through each page in the PDF
    for page_num in range(pdf_document.page_count):
        print(f"Processing page {page_num + 1} of {pdf_document.page_count}...")
        
        # Get the page object
        page = pdf_document.load_page(page_num)
        
        # Convert the page to a pixmap (image)
        pix = page.get_pixmap()
        
        # Convert the pixmap into a PIL image
        img = Image.open(io.BytesIO(pix.tobytes("png")))
        
        # Run OCR on the image using pytesseract
        page_text = pytesseract.image_to_string(img)

        # Append the extracted text to the list
        text_output.append(f"Page {page_num + 1}:\n{page_text}\n")

    # Close the document after processing
    pdf_document.close()

    # Join all pages' text into a single string
    return "\n".join(text_output)

if __name__ == "__main__":
    pdf_path = "BorTurboBasic.pdf"  # Replace with the path to your PDF file
    extracted_text = pdf_to_text(pdf_path)

    # Save the text to a file
    with open("output_text.txt", "w", encoding="utf-8") as f:
        f.write(extracted_text)

    print("OCR complete. Text extracted and saved to 'output_text.txt'.")

You need to install Tesseract for this to work, and put the path to its executable into the pytesseract.pytesseract.tesseract_cmd variable. The results are not perfect, or even great, but they’re better than nothing. If the text comes out badly garbled, it works to simply dump it into something like Google Gemini and have it pull out the info you need.
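
One thing that may help with recognition quality: the script above calls get_pixmap() with no arguments, which renders at PyMuPDF's default scale (72 DPI), and OCR engines generally do better with more pixels per character. Below is a minimal sketch of rendering at a higher resolution before OCR. The 300 DPI default is an assumed starting point, not a tested optimum, and zoom_for_dpi and ocr_page are made-up names for illustration. The OCR imports are deferred into the function so the small helper can be used on its own.

```python
import io

PDF_BASE_DPI = 72  # PyMuPDF renders at 72 DPI when no matrix is given


def zoom_for_dpi(dpi: int) -> float:
    """Scale factor that turns the 72 DPI default into the requested DPI."""
    return dpi / PDF_BASE_DPI


def ocr_page(pdf_path: str, page_num: int, dpi: int = 300) -> str:
    """OCR one page (0-based index) after rendering it at `dpi`.

    The 300 DPI default is an assumption; try other values on your own scans.
    """
    # Imported here so zoom_for_dpi() works without the OCR stack installed
    import fitz  # PyMuPDF
    import pytesseract
    from PIL import Image

    zoom = zoom_for_dpi(dpi)
    doc = fitz.open(pdf_path)
    try:
        page = doc.load_page(page_num)
        # Scale the page in both directions while rendering
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
        img = Image.open(io.BytesIO(pix.tobytes("png")))
        return pytesseract.image_to_string(img)
    finally:
        doc.close()
```

Higher resolutions make each page slower to process and use more memory, so it is probably worth trying a handful of pages at a couple of settings before running a whole manual.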

Speaking of which, here is a cheat-sheet formatted and parsed from an old Turbo Basic manual.

Turbo Basic Commands (smackaay.com)

Enjoy!

Posted in Programming
08.25.24

New prompt permutation script

A while back I made a prompt permutation script for generating large numbers of image prompts for use in automatic1111. I’ve updated it with a new operator, the incremental operator ‘&’, which cycles through its list items in order instead of choosing random ones. Here is a sample prompt file and its output. It’s basically a fancy search and replace, but I use it quite often.

photo, a %SIZE brutalist &BUILDING on a sunny day (this is the base prompt)
photo, painting
brutalist, post-modern, deconstructivism
sunny day, night time
%SIZE, small, medium, large, huge
&BUILDING, house, tower, factory, school
Output:
photo, a small brutalist house on a sunny day
photo, a huge brutalist tower on a night time
photo, a medium post-modern factory on a sunny day
photo, a medium post-modern school on a night time
photo, a medium deconstructivism house on a sunny day
photo, a large deconstructivism tower on a night time
painting, a medium brutalist factory on a sunny day
painting, a small brutalist school on a night time
painting, a huge post-modern house on a sunny day
painting, a small post-modern tower on a night time
painting, a large deconstructivism factory on a sunny day
painting, a medium deconstructivism school on a night time
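
The difference between the two operators can be sketched in a few lines. This is only an illustration under simplified assumptions (a fixed list of six identical prompts, and itertools.cycle standing in for the counter the full script keeps); the fill function and its arguments are hypothetical names.

```python
import itertools
import random

sizes = ["small", "medium", "large", "huge"]          # values for %SIZE
buildings = ["house", "tower", "factory", "school"]   # values for &BUILDING


def fill(prompts, sizes, buildings, rng):
    """Fill %SIZE at random and &BUILDING in list order, one prompt at a time."""
    building_cycle = itertools.cycle(buildings)  # '&' walks the list and wraps around
    filled = []
    for prompt in prompts:
        prompt = prompt.replace("%SIZE", rng.choice(sizes), 1)      # '%' picks at random
        prompt = prompt.replace("&BUILDING", next(building_cycle), 1)
        filled.append(prompt)
    return filled


prompts = ["a %SIZE &BUILDING"] * 6
result = fill(prompts, sizes, buildings, random.Random(0))
for line in result:
    print(line)
```

The buildings come out as house, tower, factory, school, house, tower regardless of the random seed, while the sizes vary from run to run unless the generator is seeded.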

Anyways, here is the Python script, along with an HTML version so you can use it with an interface of sorts.

http://smackaay.com/files/ppermute/ppermute.html The little webpage for it.

Here is the Python script.

import itertools
import random

# File path assignments
INPUT_FILE_PATH = 'img5.txt'  # Change this to the path of your input file
OUTPUT_FILE_PATH = 'output.txt'  # Change this to the desired path for the output file

def load_file(file_path):
    with open(file_path, 'r') as file:
        lines = file.readlines()
    return [line.strip() for line in lines]

def generate_permutations(prompt, modifiers, random_modifiers, increment_modifiers):
    all_combinations = list(itertools.product(*modifiers))
    
    permutations = []
    increment_counters = {key: 0 for key in increment_modifiers.keys()}
    
    for combination in all_combinations:
        new_prompt = prompt
        for original, replacement in zip(modifiers, combination):
            new_prompt = replace_first(new_prompt, original[0], replacement)
        
        # Handle random modifiers
        for placeholder, values in random_modifiers.items():
            if placeholder in new_prompt:
                replacement = random.choice(values)
                new_prompt = new_prompt.replace(placeholder, replacement, 1)
        
        # Handle increment modifiers
        for placeholder, values in increment_modifiers.items():
            if placeholder in new_prompt:
                replacement = values[increment_counters[placeholder] % len(values)]
                new_prompt = new_prompt.replace(placeholder, replacement, 1)
                increment_counters[placeholder] += 1
        
        # Remove placeholders from the final prompt
        new_prompt = remove_placeholders(new_prompt, random_modifiers.keys() | increment_modifiers.keys())
        
        permutations.append(new_prompt)
    
    return permutations

def replace_first(text, search, replacement):
    if search not in text:
        raise ValueError(f"Term '{search}' not found in the prompt.")
    return text.replace(search, replacement, 1)

def remove_placeholders(text, placeholders):
    for placeholder in placeholders:
        text = text.replace(placeholder, "")
    return text

def save_to_file(output_path, permutations):
    with open(output_path, 'w') as file:
        for permutation in permutations:
            file.write(permutation + '\n')

def main():
    lines = load_file(INPUT_FILE_PATH)
    if not lines:
        print("The input file is empty.")
        return

    prompt = lines[0]
    modifiers = [line.split(', ') for line in lines[1:] if not line.startswith('%') and not line.startswith('&')]
    random_modifiers = {}
    increment_modifiers = {}
    
    for line in lines[1:]:
        if line.startswith('%'):
            parts = line.split(', ')
            key = parts[0]
            values = parts[1:]
            random_modifiers[key] = values
        elif line.startswith('&'):
            parts = line.split(', ')
            key = parts[0]
            values = parts[1:]
            increment_modifiers[key] = values

    try:
        all_permutations = generate_permutations(prompt, modifiers, random_modifiers, increment_modifiers)
        save_to_file(OUTPUT_FILE_PATH, all_permutations)
        print(f"Generated prompts have been saved to {OUTPUT_FILE_PATH}")
        print(f"Total number of permutations: {len(all_permutations)}")
    except ValueError as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()

Posted in Personal stuff, Programming
08.14.24

Gage Block Buildup Calculator

So, I have that Python source code on the side of my site there for calculating gage block buildups. I figured it was time to turn it into a JS program so that people can just access it from the web. Not super complicated but useful nonetheless.

http://smackaay.com/files/gbcalc/gbcalc.html

Features as follows:

  • Imperial 81, 28, 34, 36 and 92 pc sets
  • Metric 88, 47 and 112 pc sets
  • Multiple results (as many as you want)
  • The ability to remove blocks from the list if they are either missing or used in a previous buildup. This is handy.
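
For anyone curious what a buildup search can look like, here is a rough sketch. It is not the code behind the calculator linked above. It assumes the usual composition of an 81 pc imperial set, takes at most one block from each series, and simply tries every combination. The names SERIES_81 and buildups are made up for this example.

```python
import itertools

# Block sizes in units of 0.0001" so the arithmetic is exact integers.
# Assumed composition of a typical 81 pc imperial set:
SERIES_81 = [
    list(range(1001, 1010)),            # 0.1001 to 0.1009 by 0.0001 (9 blocks)
    list(range(1010, 1500, 10)),        # 0.101 to 0.149 by 0.001 (49 blocks)
    list(range(500, 10000, 500)),       # 0.050 to 0.950 by 0.050 (19 blocks)
    list(range(10000, 50000, 10000)),   # 1.000 to 4.000 by 1.000 (4 blocks)
]


def buildups(target_inches, missing=(), series=SERIES_81):
    """Return every stack (at most one block per series) that sums to the target.

    `missing` lists block sizes in inches that are lost or already in use.
    Stacks are returned shortest first.
    """
    target = round(target_inches * 10000)
    unavailable = {round(m * 10000) for m in missing}
    # None means "take nothing from this series"
    choices = [[None] + [b for b in s if b not in unavailable] for s in series]
    found = []
    for combo in itertools.product(*choices):
        blocks = [b for b in combo if b is not None]
        if sum(blocks) == target:
            found.append(blocks)
    found.sort(key=len)
    return [[b / 10000 for b in stack] for stack in found]


print(buildups(1.3745))                   # [[0.1005, 0.124, 0.15, 1.0]]
print(buildups(1.3745, missing=[0.124]))  # []
```

Limiting the search to one block per series keeps it to about 50,000 combinations, but it also means some targets that could be reached by doubling up within a series are reported as having no buildup.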

Anyways, hope somebody out there enjoys this!

Posted in Machining, Programming
08.05.24

The YouTube Recycle Bin

I was watching a video from a YouTuber, KVN AUST: https://youtu.be/8uHFm6LK6PE?si=SLIaCEzNBx_iL97V It featured a map for looking at and searching for odd videos across YouTube. It’s pretty fun just to see little slices of life or the weird things people bother uploading, so I made a little JS proggy to generate the most common search terms.

Select the prefix and the type of random term you want to find, click on Generate Search Term and then click Search on YouTube. You can select No Spaces or With Quotes if certain things don’t work. The random date is anything in the last 20 years. Enjoy!
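
The idea can be sketched in Python as well. This is a loose approximation rather than the generator's actual code: the prefix list is an assumption (default file-name stems that cameras and phones tend to use), the function names are made up, and the random date option is left out.

```python
import random
from urllib.parse import quote_plus

# Hypothetical prefixes; the real generator may offer different ones
PREFIXES = ["IMG", "DSC", "MVI", "MOV"]


def random_search_term(prefix, rng, with_quotes=False, no_spaces=False):
    """Build a term such as 'IMG 0042' from a prefix and a random 4-digit number."""
    number = f"{rng.randrange(10000):04d}"
    term = f"{prefix}{number}" if no_spaces else f"{prefix} {number}"
    return f'"{term}"' if with_quotes else term


def youtube_search_url(term):
    """Turn a search term into a YouTube results URL."""
    return "https://www.youtube.com/results?search_query=" + quote_plus(term)


rng = random.Random()
term = random_search_term(rng.choice(PREFIXES), rng)
print(term)
print(youtube_search_url(term))
```

Quoting the term asks for an exact match, which is roughly what the With Quotes option is for.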

YouTube Recycle Bin Search Generator




Posted in Personal stuff, Programming
07.10.24

A visit from an old friend, the boreGauge

A few years back we made a gauge for measuring large bores in hydraulic cylinders. It seems the company that bought it from us needed the software for it again, so I had to dig through my old source code to see if I had a recent version. Turns out I did. On this project I did the electronics and the software, and commissioned it.

An image of the BoreGauge

To use the device, you place it in the bore, set your zeros and then measure the bore all the way down. This way you can see if there are any high spots, low spots or waviness. The software keeps track of the position as well, plots the data on the screen and provides a CSV file of it.

This was a pretty fun project, I might redesign it and make a more substantial attempt at monetizing it later.

Posted in Design, Electronics, Programming
05.28.24

StableDiffusion Permutation Script Update

So, like a week ago I wrote a script to make permutations for SD prompts. I’ve updated the script to allow for random terms as well. This allows one to add variance in the prompt without adding to the number of permutations. Everything is explained in the comment block at the top of the code. Just change the filenames near the end of the script and run.

"""
Script: prompt_permutator.py

Description:
This script generates permutations of a given prompt with various modifiers.
The script reads an input file that contains a base prompt and lists of modifiers.
It creates all possible combinations of the modifiers and generates new prompts
based on these combinations. Additionally, it handles placeholders that are randomly
replaced with specified values and ensures these placeholders are not included in the final output.

Input File Format:
- The first line contains the base prompt.
- Subsequent lines contain comma-separated lists of modifiers.
- Lines starting with a placeholder (e.g., %1) are treated as random modifiers and are replaced
  with random values from the list provided.

Example Input File (test2.txt):

A %1 flower on a hill, photorealistic
on a hill, in a vase, on a bed
photorealistic, manga
%1, Red, Green, Blue

In this example:
- The base prompt is: "A %1 flower on a hill, photorealistic"
- The modifiers are: ["on a hill", "in a vase", "on a bed"] and ["photorealistic", "manga"]
- The placeholder %1 will be replaced with a random choice from ["Red", "Green", "Blue"]

Output:
The script generates all permutations of the prompt with the modifiers and replaces
the placeholder with a random value. The results are saved to an output file.

Usage:
1. Prepare an input file (e.g., 'test2.txt') following the described format.
2. Specify the input and output file paths near the end of the script.
3. Run the script to generate the permutations and save them to the output file.

Example Execution:
$ python prompt_permutator.py

Dependencies:
- itertools
- random

Author:
Steven M

Date:
May 28, 2024

"""

import itertools
import random

def load_file(file_path):
    with open(file_path, 'r') as file:
        lines = file.readlines()
    return [line.strip() for line in lines]

def generate_permutations(prompt, modifiers, random_modifiers):
    # Create all combinations of modifiers
    all_combinations = list(itertools.product(*modifiers))
    
    permutations = []
    for combination in all_combinations:
        new_prompt = prompt
        for original, replacement in zip(modifiers, combination):
            new_prompt = replace_first(new_prompt, original[0], replacement)
        
        # Handle random modifiers
        for placeholder, values in random_modifiers.items():
            if placeholder in new_prompt:
                replacement = random.choice(values)
                new_prompt = new_prompt.replace(placeholder, replacement, 1)
        
        # Remove placeholders from the final prompt
        new_prompt = remove_placeholders(new_prompt, random_modifiers.keys())
        
        permutations.append(new_prompt)
    
    return permutations

def replace_first(text, search, replacement):
    # Helper function to replace only the first occurrence of a term
    if search not in text:
        raise ValueError(f"Term '{search}' not found in the prompt.")
    return text.replace(search, replacement, 1)

def remove_placeholders(text, placeholders):
    for placeholder in placeholders:
        text = text.replace(placeholder, "")
    return text

def save_to_file(output_path, permutations):
    with open(output_path, 'w') as file:
        for permutation in permutations:
            file.write(permutation + '\n')

def main(input_file_path, output_file_path):
    lines = load_file(input_file_path)
    if not lines:
        print("The input file is empty.")
        return

    prompt = lines[0]
    modifiers = [line.split(', ') for line in lines[1:] if not line.startswith('%')]
    random_modifiers = {}
    
    for line in lines[1:]:
        if line.startswith('%'):
            parts = line.split(', ')
            key = parts[0]
            values = parts[1:]
            random_modifiers[key] = values

    try:
        all_permutations = generate_permutations(prompt, modifiers, random_modifiers)
        save_to_file(output_file_path, all_permutations)
        print(f"Generated prompts have been saved to {output_file_path}")
        print(f"Total number of permutations: {len(all_permutations)}")
    except ValueError as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    input_file_path = 'test2.txt'  # Change this to the path of your input file
    output_file_path = 'output.txt'  # Change this to the desired path for the output file
    main(input_file_path, output_file_path)

So, in essence, any line that starts with a % is processed differently: the first item on the line is the placeholder, and wherever it appears in the prompt it gets swapped for a random pick from the rest of the line. As always, I guarantee nothing.
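
To see why the random lines don't add to the permutation count, here is a small sketch using the example file from the docstring. The helper name split_modifier_lines is made up for this illustration; the full script does the same sorting inline in main().

```python
import math


def split_modifier_lines(lines):
    """Separate ordinary search/replace lists from %-prefixed random lists."""
    regular, randoms = [], {}
    for line in lines:
        parts = line.split(", ")
        if line.startswith("%"):
            randoms[parts[0]] = parts[1:]   # placeholder -> candidate values
        else:
            regular.append(parts)
    return regular, randoms


# Everything after the base prompt in the example input file
lines = [
    "on a hill, in a vase, on a bed",
    "photorealistic, manga",
    "%1, Red, Green, Blue",
]
regular, randoms = split_modifier_lines(lines)

# Only the regular lists multiply together
total = math.prod(len(terms) for terms in regular)
print(total)    # 6
print(randoms)  # {'%1': ['Red', 'Green', 'Blue']}
```

The three colours never enter the product, so the file yields 3 times 2 prompts, each with one colour picked at random.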

Posted in Programming
05.28.24

Pong-2024

I was bored and made a quick Pong game. It’s not great, not terribly well finished but I wanted to see how good the tools are these days. It’s been a while since I wrote a game. It was fun to make. Give it a shot.

https://smackaay.com/webgames/pong2024/index.html

It’s output in HTML5 so no installation is required.

Posted in Programming
05.23.24

Resolutions for SD image generation

When making images with StableDiffusion, it’s best to keep the aspect ratio in mind and make the image fit into the total number of pixels that the model was trained on. This results in the best images for that given model. For SDXL that’s 1024×1024; for others it may be 768×768 or even 512×512. Here is a list of width and height values that come out to roughly that pixel total at the most common aspect ratios, for various training sizes. Obviously you would swap the values for a portrait orientation.

1024x1024

Aspect Ratio 4:3 - Resolution: 1182x886
Aspect Ratio 16:9 - Resolution: 1365x768
Aspect Ratio 21:9 - Resolution: 1564x670
Aspect Ratio 1:1 - Resolution: 1024x1024
Aspect Ratio 3:2 - Resolution: 1254x836
Aspect Ratio 5:4 - Resolution: 1144x915
Aspect Ratio 16:10 - Resolution: 1295x809
Aspect Ratio 2:1 - Resolution: 1448x724
Aspect Ratio 18:9 - Resolution: 1448x724
Aspect Ratio 32:9 - Resolution: 1930x543
Aspect Ratio 3:1 - Resolution: 1773x591
Aspect Ratio 4:1 - Resolution: 2048x512
Aspect Ratio 5:3 - Resolution: 1321x793

768x768

Aspect Ratio 4:3 - Resolution: 886x665
Aspect Ratio 16:9 - Resolution: 1024x576
Aspect Ratio 21:9 - Resolution: 1173x502
Aspect Ratio 1:1 - Resolution: 768x768
Aspect Ratio 3:2 - Resolution: 940x627
Aspect Ratio 5:4 - Resolution: 858x686
Aspect Ratio 16:10 - Resolution: 971x607
Aspect Ratio 2:1 - Resolution: 1086x543
Aspect Ratio 18:9 - Resolution: 1086x543
Aspect Ratio 32:9 - Resolution: 1448x407
Aspect Ratio 3:1 - Resolution: 1330x443
Aspect Ratio 4:1 - Resolution: 1536x384
Aspect Ratio 5:3 - Resolution: 991x594

512x512

Aspect Ratio 4:3 - Resolution: 591x443
Aspect Ratio 16:9 - Resolution: 682x384
Aspect Ratio 21:9 - Resolution: 782x335
Aspect Ratio 1:1 - Resolution: 512x512
Aspect Ratio 3:2 - Resolution: 627x418
Aspect Ratio 5:4 - Resolution: 572x457
Aspect Ratio 16:10 - Resolution: 647x404
Aspect Ratio 2:1 - Resolution: 724x362
Aspect Ratio 18:9 - Resolution: 724x362
Aspect Ratio 32:9 - Resolution: 965x271
Aspect Ratio 3:1 - Resolution: 886x295
Aspect Ratio 4:1 - Resolution: 1024x256
Aspect Ratio 5:3 - Resolution: 660x396

So, if for some reason you need to calculate this on your own for some future or past resolution, here is the Python.

from sympy import symbols, Eq, solve

# Define symbols
x, y = symbols('x y')

# Total pixel count to hold constant (change this for other training sizes, e.g. 1024*1024)
total_pixels = 512*512

# List of common aspect ratios as tuples (width, height)
aspect_ratios = [
    (4, 3), (16, 9), (21, 9), (1, 1), (3, 2),
    (5, 4), (16, 10), (2, 1), (18, 9), (32, 9),
    (3, 1), (4, 1), (5, 3)
]

# Iterate over the aspect ratios and solve the equations
resolutions = []
for width_ratio, height_ratio in aspect_ratios:
    # Equation 1: total pixel count stays constant
    eq1 = Eq(x * y, total_pixels)
    # Equation 2: width / height matches the aspect ratio
    eq2 = Eq(x / y, width_ratio / height_ratio)
    
    # Solve the equations
    solution = solve((eq1, eq2), (x, y))
    
    # Extract the resolution and convert to positive integers
    resolution = (abs(int(solution[0][0])), abs(int(solution[0][1])))
    resolutions.append((width_ratio, height_ratio, resolution))

# Print the results
for width_ratio, height_ratio, resolution in resolutions:
    print(f"Aspect Ratio {width_ratio}:{height_ratio} - Resolution: {resolution[0]}x{resolution[1]}")
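
The two equations also have a closed-form solution, so the same numbers can be had without a solver. With x·y = N and x/y = w/h, substituting gives x = sqrt(N·w/h) and y = sqrt(N·h/w). Here is a minimal sketch using only the standard library. The function name resolution_for and the optional multiple parameter are additions for illustration; some front ends want dimensions divisible by 8, which is what multiple=8 is meant for.

```python
import math


def resolution_for(total_pixels, width_ratio, height_ratio, multiple=1):
    """Width and height with roughly `total_pixels` pixels at the given ratio.

    Values are truncated to whole pixels, then rounded down to a multiple
    of `multiple` (1 leaves them unchanged).
    """
    # Integer square roots of N*w/h and N*h/w, truncated like int() would
    x = math.isqrt(total_pixels * width_ratio // height_ratio)
    y = math.isqrt(total_pixels * height_ratio // width_ratio)
    return x - x % multiple, y - y % multiple


print(resolution_for(1024 * 1024, 16, 9))              # (1365, 768)
print(resolution_for(512 * 512, 4, 3))                 # (591, 443)
print(resolution_for(1024 * 1024, 16, 9, multiple=8))  # (1360, 768)
```

Because both values are truncated, the product ends up slightly under the target pixel count rather than exactly on it, same as in the lists above.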

As always, I guarantee nothing. Enjoy.

Posted in Miscellaneous stuff, Programming
05.20.24

A quick script for creating prompt permutations for StableDiffusion

So, I enjoy making stuff in StableDiffusion, and the WebUI has an option for feeding it a prompt list. I like using the Prompt S/R in the scripts, but it tends to top out after about 1500 permutations. Here is a script for generating those in as many dimensions as you want.

import itertools

def load_file(file_path):
    with open(file_path, 'r') as file:
        lines = file.readlines()
    return [line.strip() for line in lines]

def generate_permutations(prompt, modifiers):
    # Create all combinations of modifiers
    all_combinations = list(itertools.product(*modifiers))
    
    permutations = []
    for combination in all_combinations:
        new_prompt = prompt
        for original, replacement in zip(modifiers, combination):
            new_prompt = replace_first(new_prompt, original[0], replacement)
        permutations.append(new_prompt)
    
    return permutations

def replace_first(text, search, replacement):
    # Helper function to replace only the first occurrence of a term
    if search not in text:
        raise ValueError(f"Term '{search}' not found in the prompt.")
    return text.replace(search, replacement, 1)

def save_to_file(output_path, permutations):
    with open(output_path, 'w') as file:
        for permutation in permutations:
            file.write(permutation + '\n')

def main(input_file_path, output_file_path):
    lines = load_file(input_file_path)
    if not lines:
        print("The input file is empty.")
        return

    prompt = lines[0]
    modifiers = [line.split(', ') for line in lines[1:]]

    try:
        all_permutations = generate_permutations(prompt, modifiers)
        save_to_file(output_file_path, all_permutations)
        print(f"Generated prompts have been saved to {output_file_path}")
        print(f"Total number of permutations: {len(all_permutations)}")
    except ValueError as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    input_file_path = 'testprompt.txt'  # Change this to the path of your input file
    output_file_path = 'output.txt'  # Change this to the desired path for the output file
    main(input_file_path, output_file_path)

This Python script generates every permutation of a prompt by swapping out words that appear in the original prompt. So, for example, here is the prompt file, with the first line being the raw prompt and the following lines being the search and replace terms:

A beautiful flower sitting on a wooden table. in a kitchen
beautiful flower, tall woman, eager beaver, soup can
sitting, jumping, glowering

This will generate the following in output.txt

A beautiful flower sitting on a wooden table. in a kitchen
A beautiful flower jumping on a wooden table. in a kitchen
A beautiful flower glowering on a wooden table. in a kitchen
A tall woman sitting on a wooden table. in a kitchen
A tall woman jumping on a wooden table. in a kitchen
A tall woman glowering on a wooden table. in a kitchen
A eager beaver sitting on a wooden table. in a kitchen
A eager beaver jumping on a wooden table. in a kitchen
A eager beaver glowering on a wooden table. in a kitchen
A soup can sitting on a wooden table. in a kitchen
A soup can jumping on a wooden table. in a kitchen
A soup can glowering on a wooden table. in a kitchen

So, the strings “beautiful flower” and “sitting” are replaced in the original prompt by each of the terms that follow them. In this case you have 4 terms in the first term line (line 2) and 3 terms in the second term line, so 4 times 3 is 12. All 12 permutations of that prompt are written to a file. The script will tell you the total number of permutations and throw an error when a term isn’t found.
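
The ordering of the output comes straight from itertools.product, which varies the last list fastest. A stripped-down sketch of just that step, with the file handling left out and the example terms hard-coded:

```python
import itertools

prompt = "A beautiful flower sitting on a wooden table. in a kitchen"
term_lines = [
    ["beautiful flower", "tall woman", "eager beaver", "soup can"],
    ["sitting", "jumping", "glowering"],
]

results = []
for combo in itertools.product(*term_lines):
    new_prompt = prompt
    # The first term on each line is the text to search for in the prompt
    for terms, replacement in zip(term_lines, combo):
        new_prompt = new_prompt.replace(terms[0], replacement, 1)
    results.append(new_prompt)

print(len(results))  # 12
print(results[0])    # A beautiful flower sitting on a wooden table. in a kitchen
print(results[-1])   # A soup can glowering on a wooden table. in a kitchen
```

Adding a third term line with 5 entries would give 4 times 3 times 5, or 60 prompts, which is how the count climbs past what Prompt S/R handles comfortably.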

I make no guarantees about this script, enjoy anyways.

Posted in Programming