HTMLReporter fails when source file is encoded in UTF-8 with BOM signature #179

@nedbat

Originally reported by pablodcar (Bitbucket: pablodcar, GitHub: pablodcar)


Hi, I'm thankful for this wonderful tool. We use it very extensively, and I hope to contribute new APIs and features in the future.

When a source file is encoded in UTF-8 with a BOM signature, //coverage.phystokens.source_encoding// returns the correct encoding: //"utf-8-sig"//. But when the file is rendered inside the HTML template and that same encoding is used to write the report to disk, a //UnicodeDecodeError// is raised, because the BOM cannot be in the middle of the final output:

  File "/home/pablo/baco-dyn/lib/python2.6/site-packages/coverage/control.py", line 603, in html_report
    reporter.report(morfs)
  File "/home/pablo/baco-dyn/lib/python2.6/site-packages/coverage/html.py", line 87, in report
    self.report_files(self.html_file, morfs, self.config.html_dir)
  File "/home/pablo/baco-dyn/lib/python2.6/site-packages/coverage/report.py", line 83, in report_files
    report_fn(cu, self.coverage._analyze(cu))
  File "/home/pablo/baco-dyn/lib/python2.6/site-packages/coverage/html.py", line 222, in html_file
    html = html.encode(encoding)
  File "/home/pablo/baco-dyn/lib/python2.6/encodings/utf_8_sig.py", line 15, in encode
    return (codecs.BOM_UTF8 + codecs.utf_8_encode(input, errors)[0], len(input))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 18296: ordinal not in range(128)
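
To make the codec behaviour behind this traceback concrete, here is a minimal sketch of how `utf-8-sig` treats the BOM on decode and on encode. It is written for Python 3, not the Python 2.6 shown in the traceback, and the sample source bytes are made up for illustration. In Python 2, calling `.encode()` on a byte string first decodes it implicitly as ASCII, which would explain an `'ascii' codec can't decode byte 0xef` error when BOM bytes sit in the middle of the page.

```python
import codecs

# Hypothetical source file contents: a UTF-8 BOM followed by one line of code.
source_bytes = codecs.BOM_UTF8 + "print('hola')\n".encode("utf-8")

# Decoding with "utf-8-sig" drops the leading BOM.
text_sig = source_bytes.decode("utf-8-sig")

# Decoding with plain "utf-8" keeps the BOM as the character U+FEFF.
text_plain = source_bytes.decode("utf-8")

# Encoding with "utf-8-sig" always prepends a BOM to the output.
reencoded_sig = text_sig.encode("utf-8-sig")

# Encoding with plain "utf-8" adds no BOM.
reencoded_plain = text_sig.encode("utf-8")
```

If the source text is ever embedded in a larger page without going through a `utf-8-sig` decode, the three BOM bytes (`0xef 0xbb 0xbf`) end up in the middle of that page.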

I'm attaching a patch that decodes and re-encodes the source file in advance, using UTF-8 when utf-8-sig is detected. I hope you can review it and consider adding this change.
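
The attached patch is not reproduced here. The following is only a rough sketch of the idea it describes, in Python 3. The helper names `normalize_encoding` and `render_report` and the `<pre>` wrapper are hypothetical stand-ins, not coverage.py's actual code.

```python
import codecs
import html


def normalize_encoding(encoding):
    """Map utf-8-sig to plain utf-8; return any other encoding unchanged."""
    # codecs.lookup() resolves aliases such as "UTF_8_SIG" to the canonical name.
    if codecs.lookup(encoding).name == "utf-8-sig":
        return "utf-8"
    return encoding


def render_report(source_bytes, encoding):
    """Decode the source, wrap it in a stand-in template, and encode the page."""
    # Decoding with the detected encoding strips the BOM when it is utf-8-sig.
    text = source_bytes.decode(encoding)
    # Stand-in for the real HTML template (assumption for illustration only).
    page = "<pre>" + html.escape(text) + "</pre>"
    # Write with plain utf-8 so no BOM is added to the generated page.
    return page.encode(normalize_encoding(encoding))
```

For example, `render_report(codecs.BOM_UTF8 + b"x = 1\n", "utf-8-sig")` yields a page with no BOM bytes anywhere in it.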

Thanks in advance,

Pablo Carballo


Labels: bug, html