Feature: improved and configurable threshold for text similarity #785

@zmbc

Currently, it appears that if the sources of two cells are less than 70% similar, they are always treated as separate cells and represented by a delete-cell operation followed by an add-cell operation. It feels like this threshold should be much lower, at least if whitespace-only lines are ignored.

There appears to be a relevant TODO:

# TODO: Add configuration framework
# TODO: Tune threshold with realistic sources
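
To make the request concrete, here is a rough sketch of what a configurable threshold that ignores whitespace-only lines could look like. It is not the project's actual code: the function names `source_similarity` and `treat_as_same_cell`, the `ignore_blank_lines` option, and the choice of a line-based `difflib.SequenceMatcher` ratio are all assumptions made for illustration. Only the 0.7 default comes from the behaviour described above.

```python
from difflib import SequenceMatcher


def source_similarity(a: str, b: str, ignore_blank_lines: bool = True) -> float:
    """Return a 0..1 line-based similarity ratio between two cell sources.

    Hypothetical helper: compares the sources line by line and can
    drop whitespace-only lines before comparing.
    """
    def to_lines(src: str) -> list[str]:
        lines = src.splitlines()
        if ignore_blank_lines:
            # Drop lines that are empty or contain only whitespace.
            lines = [ln for ln in lines if ln.strip()]
        return lines

    lines_a, lines_b = to_lines(a), to_lines(b)
    if not lines_a and not lines_b:
        # Two empty sources are considered identical.
        return 1.0
    return SequenceMatcher(None, lines_a, lines_b, autojunk=False).ratio()


def treat_as_same_cell(a: str, b: str, threshold: float = 0.7,
                       ignore_blank_lines: bool = True) -> bool:
    """Decide whether two sources are similar enough to be diffed as one
    modified cell rather than a delete-cell plus add-cell pair.

    `threshold` is the configurable part; 0.7 mirrors the 70% mentioned above.
    """
    return source_similarity(a, b, ignore_blank_lines) >= threshold


old = "x = 1\n\ny = 2\n"
new = "x = 1\ny = 2\n\n\n"
print(treat_as_same_cell(old, new, ignore_blank_lines=False))
print(treat_as_same_cell(old, new, ignore_blank_lines=True))
```

In this example the two sources differ only in blank lines. With blank lines counted, the ratio falls below 0.7 and the cells would be split; with blank lines ignored, they are treated as the same cell. Passing a lower `threshold` is the other knob asked for here.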
