Script is broken as a result of website redesign #10

Open
FThompson opened this issue Mar 2, 2021 · 6 comments

FThompson (Owner) commented Mar 2, 2021

The BBC redesigned their website, and it now lives at a new location: https://sound-effects.bbcrewind.co.uk/

There's no longer a CSV of every file, and the files are named differently, so the script will need to be rewritten to scrape the website and download the files that way.

If the torrent stays active, I might choose not to update this script so we avoid racking up a large cloud service bill for the BBC.

Tobe2d commented Apr 17, 2021

+1

After it finished downloading, I found out that every file is only 6 KB ;-(

Hope to see an updated script soon.
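
A quick way to confirm which downloads are affected is to list the files below a size threshold. This is a minimal sketch, not part of the original script: the `sounds` folder name and the 10 KB cutoff are both assumptions chosen for illustration.

```python
from pathlib import Path


def find_small_files(folder, max_bytes=10 * 1024):
    """Return files under `folder` whose size is at most `max_bytes`, sorted by path."""
    return sorted(
        path for path in Path(folder).rglob('*')
        if path.is_file() and path.stat().st_size <= max_bytes
    )


if __name__ == '__main__':
    # 'sounds' is an assumed download folder; point this at your own
    for path in find_small_files('sounds'):
        print(f"{path} ({path.stat().st_size} bytes)")
```

Anything it lists is probably a placeholder response rather than real audio and can be deleted before retrying.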

Eptiar commented Apr 17, 2021

Hey. I messaged you on Reddit just a few minutes ago without realising this issue existed. It'd be great to get updated code. I've tried the torrent, but with no luck. Also, the new site has 33,000 or so samples now. If you ever write a new version, could you PM me on Reddit? @thatdrummerchap. Cheers! I hope everything goes well.

@70hundert

I would be interested in this too! I tried to edit the new location in the code, but without luck so far.

meseck commented Jan 7, 2022

It seems that all the old sound files from the csv file are available at least with this URL: "https://sound-effects-media.bbcrewind.co.uk/wav"

With my little bash script (fools-mate/random-sample) it definitely works again.
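
As a rough illustration of how that base URL might be used, here is a minimal sketch. It assumes the file sits directly under the base URL under the same name as the `location` value in the old CSV; that path layout is a guess and is not confirmed in this thread. `wav_url` and `example.wav` are hypothetical names.

```python
from urllib.parse import quote

# Base URL quoted in the comment above
WAV_BASE_URL = 'https://sound-effects-media.bbcrewind.co.uk/wav'


def wav_url(location, base_url=WAV_BASE_URL):
    """Build a download URL by appending a CSV `location` value to the base URL."""
    # Assumption: the file is served as <base_url>/<location>
    return f"{base_url.rstrip('/')}/{quote(location)}"


if __name__ == '__main__':
    # 'example.wav' is a made-up location value, not a real file
    print(wav_url('example.wav'))
```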

kitchWWW commented Mar 6, 2023

For folks finding this, change the following lines for L59-61:

                filepath = Path('sounds') / folder / (filename+".zip")
                if not filepath.exists():
                    url = 'http://sound-effects-media.bbcrewind.co.uk/zip/' + row['location']+'.zip'

This points the script at the proper new URL and gives you a bunch of zip files instead of WAV files. I'm still working on how to unzip all of these in an intelligent way.

prh78 commented Jun 7, 2023

Thanks for this, it worked. I used AI to make a script that unzips the files and renames them to match the archive, since the meaningful description was lost in the output file. I had to tweak it after the script missed archives with multiple periods in the filename.

import os
import zipfile

def process_archives_recursive(folder_path):
    # Iterate over all files and directories in the folder
    for root, dirs, files in os.walk(folder_path):
        for file in files:
            # Check if the file is a zip archive
            if file.endswith('.zip'):
                archive_path = os.path.join(root, file)
                rename_archive(archive_path)

def rename_archive(archive_path):
    # Extract the directory and filename from the archive path
    directory = os.path.dirname(archive_path)
    filename = os.path.basename(archive_path)

    # Split the filename and extension
    base_name, extension = os.path.splitext(filename)

    # Replace dots with hyphens in the base name
    new_base_name = base_name.replace('.', '-')

    # Create the new filename, keeping the original .zip extension
    new_filename = new_base_name + extension

    # Rename the archive file
    new_archive_path = os.path.join(directory, new_filename)
    os.rename(archive_path, new_archive_path)

    print(f"Archive renamed to: {new_archive_path}")

    # Unzip the renamed archive
    unzip_archive(new_archive_path, directory)

def unzip_archive(archive_path, directory):
    # Open the archive
    with zipfile.ZipFile(archive_path, 'r') as zip_ref:
        # Extract all files from the archive
        zip_ref.extractall(directory)

        # Get the list of extracted files while the archive is still open
        extracted_files = zip_ref.namelist()

    if len(extracted_files) == 1:
        extracted_file = extracted_files[0]
        extracted_file_path = os.path.join(directory, extracted_file)

        # Rename the extracted file to match the archive name with the WAV extension
        archive_base_name, _ = os.path.splitext(os.path.basename(archive_path))
        new_file_name = archive_base_name + '.wav'
        new_file_path = os.path.join(directory, new_file_name)
        os.rename(extracted_file_path, new_file_path)

        print(f"File extracted and renamed to: {new_file_path}")
    else:
        print(f"The archive '{os.path.basename(archive_path)}' does not contain a single file.")

# Example usage
if __name__ == '__main__':
    folder_path = '/path/to/folder'
    process_archives_recursive(folder_path)

Make sure to replace '/path/to/folder' with the actual path to your folder. The script first replaces the periods in each archive's base name with hyphens, then extracts the renamed archive. If an archive contains a single file, that file is extracted and renamed to match the archive, with a .wav extension. If an archive contains multiple files, they are still extracted under their original names, and the script prints a message saying the archive does not contain a single file. The new names and extraction details are printed as output.
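
An alternative that avoids renaming the archives at all is to take the base name with `pathlib`, since `Path.stem` strips only the final `.zip` and leaves earlier periods alone. This is a minimal sketch under stated assumptions, not a drop-in replacement: `extract_single_wav` is a hypothetical helper, and it assumes each archive should yield exactly one WAV file written next to the zip.

```python
import shutil
import zipfile
from pathlib import Path


def extract_single_wav(archive_path):
    """Extract a one-file zip next to itself, naming the result after the archive.

    Returns the new WAV path, or None when the archive does not hold exactly one file.
    """
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        # Ignore directory entries so only real files are counted
        members = [info for info in archive.infolist() if not info.is_dir()]
        if len(members) != 1:
            return None

        # Path.stem drops only the trailing ".zip"; periods earlier in the name survive
        stem = archive_path.stem
        name = stem if stem.lower().endswith('.wav') else stem + '.wav'
        target = archive_path.with_name(name)

        # Stream the member straight to the target path instead of using extractall
        with archive.open(members[0]) as source, open(target, 'wb') as destination:
            shutil.copyfileobj(source, destination)
    return target


if __name__ == '__main__':
    # '/path/to/folder' is a placeholder; replace it with your download folder
    for zip_path in Path('/path/to/folder').rglob('*.zip'):
        result = extract_single_wav(zip_path)
        print(f"{zip_path.name} -> {result.name if result else 'skipped'}")
```

Writing to an explicit target path also means the name stored inside the archive never decides where the file lands on disk.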
