Introduction

Following up on our guide to migrating Git LFS objects to Bitbucket Cloud, this post covers updating repository references within your codebase after the migration. Once your repositories live in Bitbucket Cloud, every internal reference must point to the new cloud location. The script introduced here automates that step by rewriting references in your code to the new Bitbucket Cloud repository URLs.

Moving to Bitbucket Cloud can involve more than migrating repositories and their LFS objects: repository URLs hardcoded in project files must be updated as well. Build and deployment pipelines that still reference the old server would otherwise point to the wrong location.

Understanding Repository References

Repository references are often found in configuration files, build scripts, or documentation, indicating where the source code resides. These references might include SSH or HTTP URLs pointing to your old Bitbucket Server (Data Center) repositories. Updating these to match your new Bitbucket Cloud URLs is essential for seamless project operations in the cloud environment.
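
To make the mapping concrete, here is a minimal sketch that converts one SSH and one HTTPS clone URL into the Bitbucket Cloud form. The host bitbucket.example.com, the project key PROJ, the workspace my-workspace, and the helper name to_cloud_url are hypothetical placeholders chosen for illustration.

```python
import re

# Hypothetical values for illustration only.
OLD_DOMAIN = "bitbucket.example.com"
WORKSPACE = "my-workspace"


def to_cloud_url(text: str) -> str:
    """Rewrite Bitbucket Server clone URLs found in text to Bitbucket Cloud URLs."""
    domain = re.escape(OLD_DOMAIN)  # match the dots in the host name literally
    ssh = re.compile(rf"ssh://git@{domain}/(?:[\w.~-]+/)+(?P<repo>[\w.-]+\.git)")
    http = re.compile(rf"https?://{domain}/(?:[\w.~-]+/)*(?P<repo>[\w.-]+\.git)")
    text = ssh.sub(rf"git@bitbucket.org:{WORKSPACE}/\g<repo>", text)
    text = http.sub(rf"https://bitbucket.org/{WORKSPACE}/\g<repo>", text)
    return text


print(to_cloud_url("ssh://git@bitbucket.example.com/PROJ/app.git"))
# git@bitbucket.org:my-workspace/app.git
print(to_cloud_url("https://bitbucket.example.com/scm/PROJ/app.git"))
# https://bitbucket.org/my-workspace/app.git
```

Only the repository name survives the rewrite: the project key from the old URL is dropped, and every repository is assumed to land in a single workspace.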

Automating Reference Updates with Python

This Python script automates the process of updating repository references within your codebase. It scans your migrated repositories for any old Bitbucket Server URLs and replaces them with the corresponding Bitbucket Cloud URLs.

Script Features

  • Comprehensive Search and Replace: The script searches for both SSH and HTTP URLs of your Bitbucket Server and replaces them with the new Bitbucket Cloud repository URLs.
  • Automatic Commit and Push: After updating the references, the script automatically commits these changes to your repository and pushes them to Bitbucket Cloud.

Prerequisites

Before running the script, ensure you have:

  • Completed the migration of your repositories and Git LFS objects to Bitbucket Cloud as outlined in our previous post.
  • Python and Git installed on your machine.
  • The config.py file from our previous guide, containing your Bitbucket Server and Cloud credentials and configurations.
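
The script below reads only three names from config.py: on_prem['domain'], cloud['workspace'], and repository_folder. A minimal sketch of that part of the file might look like the following. All values are hypothetical placeholders, and the config.py from the previous guide may contain further settings, such as credentials, that this script does not use.

```python
# config.py (sketch): only the names this script reads; all values are hypothetical.
on_prem = {
    "domain": "bitbucket.example.com",  # host name of the old Bitbucket Server
}
cloud = {
    "workspace": "my-workspace",  # Bitbucket Cloud workspace that now holds the repositories
}
# Folder, relative to the working directory, that contains the cloned repositories
repository_folder = "repositories"
```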

import csv
import logging
import os
import re
import subprocess

from config import cloud, on_prem, repository_folder

# Initialize logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# Configuration variables: customize these as needed
script_location = os.path.dirname(os.path.abspath(__file__))
input_csv = os.path.join(script_location, "merged_repositories.csv")  # CSV located next to this script
folder = os.getcwd()  # Base folder; repositories are expected under <folder>/<repository_folder>
should_push = True  # Whether to commit the changes and push them to the remote repository


def run_command(command, cwd=None):
    """Execute a system command with optional working directory."""
    try:
        subprocess.run(command, cwd=cwd, check=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        logging.info(f"Successfully executed: {' '.join(command)}")
    except subprocess.CalledProcessError as e:
        logging.error(f"Error executing command: {' '.join(command)}\n{e.stdout.decode()}")
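

def replace_references_in_file(filepath, patterns):
    """Replace patterns in a file based on provided mappings."""
    if not os.path.exists(filepath):
        logging.warning(f"File not found: {filepath}")
        return
    try:
        # newline="" keeps the file's original line endings on read and write
        with open(filepath, "r", encoding="utf-8", newline="") as file:
            original = file.read()
    except UnicodeDecodeError:
        logging.debug(f"Skipping non-UTF-8 (likely binary) file: {filepath}")
        return
    except OSError:
        logging.exception(f"Error reading file: {filepath}")
        return
    data = original
    for pattern, subst in patterns.items():
        data = re.sub(pattern, subst, data, flags=re.MULTILINE)
    if data == original:
        return  # Nothing matched; leave the file untouched
    try:
        with open(filepath, "w", encoding="utf-8", newline="") as file:
            file.write(data)
        logging.info(f"Updated references in file: {filepath}")
    except OSError:
        logging.exception(f"Error writing file: {filepath}")

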
def commit_and_push_changes(repo_folder, branch="master"):
    """Commit changes in the repository and push them to the cloud."""
    try:
        # Check for uncommitted changes
        status_output = subprocess.check_output(["git", "status", "--porcelain"], cwd=repo_folder).decode().strip()
        if status_output:
            # Stage all changes
            subprocess.run(["git", "add", "."], cwd=repo_folder, check=True)
            # Commit changes
            subprocess.run(["git", "commit", "-m", "Update domain references"], cwd=repo_folder, check=True)
            # Push changes
            subprocess.run(["git", "push", "cloud", branch], cwd=repo_folder, check=True)
            logging.info(f"Changes pushed for {os.path.basename(repo_folder)}.")
        else:
            logging.info(f"No changes to commit for {os.path.basename(repo_folder)}.")
    except subprocess.CalledProcessError as e:
        logging.error(f"Error processing {os.path.basename(repo_folder)}: {e}")


def process_repository(repo_folder, patterns):
    """Process files in a repository folder to replace references and commit changes."""
    for root, dirs, files in os.walk(repo_folder):
        # Skip Git metadata so .git/config and internal objects are never rewritten
        dirs[:] = [d for d in dirs if d != ".git"]
        for name in files:
            filepath = os.path.join(root, name)
            replace_references_in_file(filepath, patterns)
    if should_push:
        commit_and_push_changes(repo_folder)
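

def main():
    # Escape the domain so its dots match literally. Path segments are limited to
    # URL-safe characters so a match cannot run on to a later ".git" on the same line.
    # An optional ":port" after the host is tolerated for both URL types.
    domain = re.escape(on_prem['domain'])
    patterns = {
        rf"ssh://git@{domain}(?::\d+)?/(?:[\w.~-]+/)+(?P<repository>[\w.-]+\.git)": f"git@bitbucket.org:{cloud['workspace']}/\\g<repository>",
        rf"https?://{domain}(?::\d+)?/(?:[\w.~-]+/)*(?P<repository>[\w.-]+\.git)": f"https://bitbucket.org/{cloud['workspace']}/\\g<repository>"
    }
    with open(input_csv, newline='') as csvfile:
        reader = csv.DictReader(csvfile, delimiter=',')
        for row in reader:
            repo_folder = os.path.join(folder, repository_folder, row['name'])
            if not os.path.isdir(repo_folder):
                logging.warning(f"Repository folder not found, skipping: {repo_folder}")
                continue
            logging.info(f"Processing repository: {row['name']}")
            process_repository(repo_folder, patterns)


if __name__ == "__main__":
    main()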

Running the Script

  1. Prepare Your Environment: Place the script in the same directory as your config.py and the merged_repositories.csv file generated in the previous migration steps.
  2. Customize Patterns: The patterns defined in main() are built from on_prem['domain'] and cloud['workspace'] in config.py. Confirm that these values match your Bitbucket Server’s host name and your new workspace on Bitbucket Cloud.
  3. Execute the Script: Run the script to process each repository. The script will:
    • Find and replace old repository references with the new Bitbucket Cloud URLs.
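    • Commit these changes with the message “Update domain references”.
    • Push the commit to the master branch of the Git remote named cloud, which must already exist in each local clone. Set should_push to False to skip both the commit and the push.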
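
Because the script rewrites files and pushes right away, it can be worth previewing what would change first. The following is a minimal dry-run sketch and not part of the original script: preview_matches is a hypothetical helper that lists the lines matching a pattern without modifying any file, skipping the .git directory and any file that is not valid UTF-8.

```python
import os
import re
from typing import Iterator, Tuple


def preview_matches(repo_folder: str, pattern: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, line) for each matching line; nothing is modified."""
    regex = re.compile(pattern)
    for root, dirs, files in os.walk(repo_folder):
        dirs[:] = [d for d in dirs if d != ".git"]  # do not descend into Git metadata
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "r", encoding="utf-8") as handle:
                    for number, line in enumerate(handle, start=1):
                        if regex.search(line):
                            yield path, number, line.rstrip("\n")
            except (UnicodeDecodeError, OSError):
                continue  # binary or unreadable file: skip it
```

Iterating over preview_matches(repo_folder, r"bitbucket\.example\.com"), with your own server's host name substituted for the placeholder, and printing each tuple gives a quick list of the lines the real run would touch.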

Verifying the Updates

After running the script, it’s good practice to verify that the repository references in your codebase have been correctly updated. Check a few files manually, or run your build and deployment processes to ensure they execute without issues, pointing to the correct repositories in Bitbucket Cloud.
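
One way to support the manual check is to scan each repository for any remaining mention of the old host. The sketch below is an assumption-laden helper rather than part of the original script: files_still_referencing is a hypothetical name, and it compares raw bytes so that files in encodings other than UTF-8 are covered too. It reads each file fully into memory, which is fine for typical source trees but not for very large files.

```python
import os
from typing import List


def files_still_referencing(repo_folder: str, old_domain: str) -> List[str]:
    """Return sorted relative paths of files whose raw bytes still contain old_domain."""
    needle = old_domain.encode("utf-8")
    stale = []
    for root, dirs, files in os.walk(repo_folder):
        dirs[:] = [d for d in dirs if d != ".git"]  # ignore Git metadata
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        stale.append(os.path.relpath(path, repo_folder))
            except OSError:
                continue  # unreadable file: skip it
    return sorted(stale)
```

An empty list means no file outside .git mentions the old host any more. Any path that does appear is worth opening by hand, since the reference may use a URL form the patterns do not cover.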

Conclusion

Updating repository references is a crucial step in the migration process, ensuring that your build and deployment systems correctly interact with your new cloud-based repositories. This script simplifies what could otherwise be a tedious manual process, letting you focus on leveraging the benefits of Bitbucket Cloud for your development workflows.

Stay tuned for more guides and tools to streamline your migration to Bitbucket Cloud, ensuring a smooth transition for your development projects.

Remember, thorough testing in a controlled environment before applying these changes to your production repositories can help avoid unexpected disruptions.