Skip to content

Conversation

Phinart98
Copy link
Collaborator

closes #22

  • Added ContributorYAMLParsingTests with full coverage for YAML fetching
  • Added UtilityFunctionTests for GitHub URL generation functions
  • Added HomeViewIntegrationTests for view integration scenarios
  • Tested error handling for network issues, YAML parsing, and invalid data
  • Added coverage dependency and testing instructions to README
  • Added placeholder structure for future package parsing tests

Copy link
Member

@lwasser lwasser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here there @Phinart98 , I am not able to successfully run the tests with UV.

uv run python manage.py test
Found 16 test(s).
Creating test database for alias 'default'...
System check identified no issues (0 silenced).
.Unexpected error fetching contributors: YAML data does not contain a list of contributors
FFailed to fetch contributors from https://raw.githubusercontent.com/pyOpenSci/pyopensci.github.io/main/_data/contributors.yml: <urlopen error Network error>
.Failed to parse YAML: YAML parse error
...Failed to get recent contributors: Fetch failed
..Unexpected error getting recent contributors: Unexpected error
........
======================================================================
FAIL: test_fetch_contributors_yaml_invalid_data_type (core.tests.ContributorYAMLParsingTests.test_fetch_contributors_yaml_invalid_data_type)
Test handling of invalid data type (not a list).
----------------------------------------------------------------------
Traceback (most recent call last):
  File "share/uv/python/cpython-3.13.0-macos-aarch64-none/lib/python3.13/unittest/mock.py", line 1423, in patched
    return func(*newargs, **newkeywargs)
  File "/pyos/pyopensci-django/core/tests.py", line 114, in test_fetch_contributors_yaml_invalid_data_type
    self.assertIn('should be a list', str(context.exception))
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'should be a list' not found in 'Unexpected error: YAML data does not contain a list of contributors'

----------------------------------------------------------------------
Ran 16 tests in 0.025s

FAILED (failures=1)
Destroying test database for alias 'default'...

Do they run for you? Thank you for the work on this. I haven't tried to debug yet but they do fail for me specifically this test:

test_fetch_contributors_yaml_invalid_data_type

@Phinart98
Copy link
Collaborator Author

@lwasser, I was left a bit confused when I saw this because the tests passed on my computer😅. Looking at the code, line 53 of utils.py shows 'YAML data should be a list of contributors', but your error shows 'YAML data does not contain a list of contributors'.
The test expects 'should be a list', but you're getting 'does not contain'. The 'Unexpected error:' prefix suggests it's hitting the generic exception handler, meaning the original error is being caught and re-wrapped or something. The test assertion is checking for a substring match of error messages..

I haven't figured out what's wrong, but I'll look into it. If you have an idea about what could be wrong, you can draw my attention to it too.

@lwasser
Copy link
Member

lwasser commented Sep 4, 2025

@Phinart98 let me try to rerun it today. I am looking more closely at the error, and I actually see that it wasn't able to download the YAML file.

FFailed to fetch contributors from https://raw.githubusercontent.com/pyOpenSci/pyopensci.github.io/main/_data/contributors.yml: <urlopen error Network error>

I was a bit tired at the end of the day looking at those pr's. let me try again with fresh eyes.

@lwasser
Copy link
Member

lwasser commented Sep 4, 2025

@Phinart98 i just debugged the test. this is what i see.

    """
    Fetch contributor data from YAML source.
    
    Parameters
    ----------
    url : str, optional
        URL to fetch YAML from. If None, uses the default pyOpenSci GitHub URL.
        
    Returns
    -------
    list of dict
        List of contributor dictionaries.
        
    Raises
    ------
    ContributorDataError
        If data cannot be fetched or parsed.
    """
    if url is None:
        url = "https://raw.githubusercontent.com/pyOpenSci/pyopensci.github.io/main/_data/contributors.yml"
    
    try:
        with urlopen(url) as response:
            yaml_content = response.read().decode('utf-8')
            data = yaml.load(yaml_content)

        # Handle both list and mapping root
        if isinstance(data, list):
            contributors = data
        elif isinstance(data, dict):
            # Try common keys
            for key in ("contributors", "people", "members"):
                if key in data and isinstance(data[key], list):
                    contributors = data[key]
                    break
            else:
                raise ContributorDataError("YAML data does not contain a list of contributors")
        else:
            raise ContributorDataError("YAML data should be a list or a mapping containing a list of contributors")

        logger.info(f"Successfully fetched {len(contributors)} contributors from {url}")
        return contributors

    except URLError as e:
        logger.error(f"Failed to fetch contributors from {url}: {e}")
        raise ContributorDataError(f"Network error: {e}")
    except YAMLError as e:
        logger.error(f"Failed to parse YAML: {e}")
        raise ContributorDataError(f"YAML parsing error: {e}")
    except Exception as e:
        logger.error(f"Unexpected error fetching contributors: {e}")
        raise ContributorDataError(f"Unexpected error: {e}")

it hits this generic exception

    except Exception as e:
        logger.error(f"Unexpected error fetching contributors: {e}")
        raise ContributorDataError(f"Unexpected error: {e}")

and raises

YAML data does not contain a

However, your test checks for self.assertIn('should be a list', str(context.exception)), so the expected string is incorrect. I'm confused as to why the test runs for you but not me. But it should fail based on what I see is currently written.

This function is also quite complex. I wonder if there is a way to refactor so it's a bit easier to debug. But unless I'm missing something the failing test should fail as written.

@Phinart98
Copy link
Collaborator Author

That's insightful. I looked into this a bit and also found it strange that it was passing for me. I will examine it further and, if possible, simplify the function's complexity. I will tag you when I figure it out or make a change.

@Phinart98
Copy link
Collaborator Author

Phinart98 commented Sep 5, 2025

@lwasser , I looked into this a bit deeper and just found what was causing the issues. Before merging the contributor parsing PR, @melissawm asked about the data cleaning I was doing, and I removed them because I felt like that was where the conversation was headed before you went on break😅.

So now the functions in utils.py are a lot simpler than what you just showed me. To fix the error, you'll need to pull the latest version of the main branch and have the new functions

I have also appended the tests with the package data parsing ones too, so if this works for you, and everything is alright, we can proceed to merge.
I made some changes in utils.py because I copied and pasted some code from the contributor parsing logic when writing the code for the package data parsing, and forgot to edit everything to match😅

- Add ContributorYAMLParsingTests with full coverage for YAML fetching
- Add UtilityFunctionTests for GitHub URL generation functions
- Add HomeViewIntegrationTests for view integration scenarios
- Test error handling for network issues, YAML parsing, and invalid data
- Add coverage dependency and testing instructions to README
- Add placeholder structure for future package parsing tests
- Complete package parsing tests covering success/error scenarios and view integration
- Add PackageYAMLParsingTests and PackageViewIntegrationTests classes
- Fix package functions to raise PackageDataError instead of ContributorDataError
- Package error handling was copied from contributor implementation and inadvertently left unchanged
@lwasser
Copy link
Member

lwasser commented Sep 8, 2025

Ahhh - ok I'll try again! I thought I had rebased, but it's very possible I didn't as I'm bouncing between a lot of different projects!

- Remove noisy error messages that cluttered test results
- Resolves confusion between logging output and actual test failures
@Phinart98
Copy link
Collaborator Author

Phinart98 commented Sep 17, 2025

Hello @lwasser, I have looked into this a bit more, so that I can focus on checking all tests when I work on #38 . It turns out that the tests were passing like @/Yogendra said on Slack, but the outputs were just really confusing because the tests I wrote for error handling were logging error messages during execution. This made it look like some tests were failing when they were actually just testing that error scenarios work correctly. Those "Failed to fetch contributors" and "Unexpected error fetching packages" messages weren't test failures. They were logs from tests that verify our error handling works.
The earlier output you shared above actually had a real failure, but I fixed that in the 2nd commit before this one. So if you can get an output like @melissawm shared on Slack, then that will work for you now to resolve all issues.

I've just committed a fix that disables logging during tests, so now you'll see a cleaner output. Hopefully, when you check out the PR this time, you don't see a failure😅. I'll look forward to your outputs.


# Disable logging during tests for cleaner output
logging.disable(logging.CRITICAL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good call for a test suite.

Copy link
Member

@lwasser lwasser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this @Phinart98 !! thank you for all of the work on this pr.

@lwasser lwasser changed the title Add comprehensive unit tests for contributor YAML parsing chore(tests): Add comprehensive unit tests for contributor YAML parsing Sep 17, 2025
@lwasser lwasser merged commit 8e9d31d into pyOpenSci:main Sep 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add unit tests for contributor and package YAML parsing
2 participants