JAVA: Make `TokenStreamRewriter#reduceToSingleOperationPerIndex` faster for large amount of rewrite operations #4887

snuyanzin · 2025-09-14T11:13:09Z

This PR proposes using BitSets to track ReplaceOps and InsertBeforeOps in `TokenStreamRewriter#reduceToSingleOperationPerIndex`.
It also tries to improve the situation described in #4886.
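
To illustrate the general idea (this is not the code from the PR), here is a minimal sketch of tracking operation kinds with a `java.util.BitSet`. It assumes the operations live in a list and that lookups of the form "all operations of kind K before position i" are frequent. The `Op` and `Kind` types and both helper methods are hypothetical stand-ins, not ANTLR API.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class BitSetOpTracking {
    // Hypothetical stand-ins for the rewriter's operation types.
    enum Kind { INSERT_BEFORE, REPLACE }

    static final class Op {
        final Kind kind;
        final int tokenIndex;

        Op(Kind kind, int tokenIndex) {
            this.kind = kind;
            this.tokenIndex = tokenIndex;
        }
    }

    /** Marks every list position that holds an operation of the given kind. */
    static BitSet positionsOf(List<Op> ops, Kind kind) {
        BitSet bits = new BitSet(ops.size());
        for (int i = 0; i < ops.size(); i++) {
            if (ops.get(i).kind == kind) {
                bits.set(i);
            }
        }
        return bits;
    }

    /** Collects the tracked positions that are strictly smaller than {@code before}. */
    static List<Integer> earlierPositions(BitSet tracked, int before) {
        List<Integer> result = new ArrayList<>();
        // nextSetBit jumps straight to the next tracked position,
        // so positions holding other kinds of operations are never inspected.
        for (int i = tracked.nextSetBit(0); i >= 0 && i < before; i = tracked.nextSetBit(i + 1)) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Op> ops = new ArrayList<>();
        ops.add(new Op(Kind.REPLACE, 0));        // position 0
        ops.add(new Op(Kind.INSERT_BEFORE, 0));  // position 1
        ops.add(new Op(Kind.REPLACE, 0));        // position 2
        ops.add(new Op(Kind.INSERT_BEFORE, 1));  // position 3

        BitSet inserts = positionsOf(ops, Kind.INSERT_BEFORE);
        System.out.println(earlierPositions(inserts, 3)); // prints [1]
    }
}
```

With this layout, dropping an operation amounts to clearing its bit with `BitSet.clear(int)`, so later lookups skip it without rescanning the list.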

I also ran some measurements.
Before this PR:

Benchmark                 Mode  Cnt      Score     Error   Units
MyBenchmark.tokens1      thrpt   10  25429.444 ± 422.468  ops/ms
MyBenchmark.tokens10     thrpt   10   2098.225 ±  14.184  ops/ms
MyBenchmark.tokens100    thrpt   10     26.714 ±   2.210  ops/ms
MyBenchmark.tokens1000   thrpt   10      0.254 ±   0.003  ops/ms
MyBenchmark.tokens10000  thrpt   10      0.002 ±   0.001  ops/ms

After this PR:

Benchmark                 Mode  Cnt      Score     Error   Units
MyBenchmark.tokens1      thrpt   10  23745.040 ± 218.017  ops/ms
MyBenchmark.tokens10     thrpt   10   4286.191 ±  32.264  ops/ms
MyBenchmark.tokens100    thrpt   10     84.262 ±   0.886  ops/ms
MyBenchmark.tokens1000   thrpt   10      0.929 ±   0.005  ops/ms
MyBenchmark.tokens10000  thrpt   10      0.009 ±   0.001  ops/ms
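
For a quick read of the two tables, the throughput ratio (after divided by before) can be computed as in the sketch below. The class and method names are made up for illustration, and the numbers are copied from the tables above. The ratios come out at roughly 0.93x for `tokens1`, 2.04x for `tokens10`, 3.15x for `tokens100` and 3.66x for `tokens1000`. The `tokens10000` scores have only one significant digit, so their ratio is a coarse estimate at best.

```java
import java.util.Locale;

public class SpeedupRatios {
    /** Throughput ratio: values above 1 mean the "after" run completed more ops/ms. */
    static double speedup(double before, double after) {
        return after / before;
    }

    public static void main(String[] args) {
        String[] names = {"tokens1", "tokens10", "tokens100", "tokens1000", "tokens10000"};
        // Scores in ops/ms, taken from the benchmark tables above.
        double[] before = {25429.444, 2098.225, 26.714, 0.254, 0.002};
        double[] after = {23745.040, 4286.191, 84.262, 0.929, 0.009};

        for (int i = 0; i < names.length; i++) {
            System.out.printf(Locale.ROOT, "%-12s %.2fx%n", names[i], speedup(before[i], after[i]));
        }
    }
}
```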

The benchmark code:

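import java.util.concurrent.TimeUnit;

import org.antlr.runtime.RecognitionException;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.LexerInterpreter;
import org.antlr.v4.runtime.TokenStreamRewriter;
import org.antlr.v4.tool.LexerGrammar;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

// Every benchmark method below overrides this mode with Mode.Throughput,
// which is why the results are reported as thrpt in ops/ms.
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(value = 2, jvmArgs = {"-Xms2G", "-Xmx2G"})
public class MyBenchmark {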
	public static void main(String[] args) throws RunnerException {

		Options opt = new OptionsBuilder()
			.include(MyBenchmark.class.getSimpleName())
			.build();

		new Runner(opt).run();
	}

	@Benchmark
	@BenchmarkMode(Mode.Throughput)
	public void tokens1(Blackhole blackhole) {
		blackhole.consume(TOKENS1.getText());
	}

	@Benchmark
	@BenchmarkMode(Mode.Throughput)
	public void tokens10(Blackhole blackhole) {
		blackhole.consume(TOKENS10.getText());
	}

	@Benchmark
	@BenchmarkMode(Mode.Throughput)
	public void tokens100(Blackhole blackhole) {
		blackhole.consume(TOKENS100.getText());
	}

	@Benchmark
	@BenchmarkMode(Mode.Throughput)
	public void tokens1000(Blackhole blackhole) {
		blackhole.consume(TOKENS1000.getText());
	}

	@Benchmark
	@BenchmarkMode(Mode.Throughput)
	public void tokens10000(Blackhole blackhole) {
		blackhole.consume(TOKENS10000.getText());
	}

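	// Each rewriter is built once; a benchmark call only measures getText().
	private static final TokenStreamRewriter TOKENS1 = getTokens(1);
	private static final TokenStreamRewriter TOKENS10 = getTokens(10);
	private static final TokenStreamRewriter TOKENS100 = getTokens(100);
	private static final TokenStreamRewriter TOKENS1000 = getTokens(1000);
	private static final TokenStreamRewriter TOKENS10000 = getTokens(10000);

	// Builds a rewriter holding `size` replace and `size` insertBefore operations.
	private static TokenStreamRewriter getTokens(int size) {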
		LexerGrammar g;
		try {
			g = new LexerGrammar(
				"lexer grammar T;\n"+
					"A : 'a';\n" +
					"B : 'b';\n" +
					"C : 'c';\n");
		} catch (RecognitionException e) {
			throw new RuntimeException(e);
		}
		String input = "abc";
		LexerInterpreter lexEngine = g.createLexerInterpreter(new ANTLRInputStream(input));
		CommonTokenStream stream = new CommonTokenStream(lexEngine);
		stream.fill();
		TokenStreamRewriter tokens = new TokenStreamRewriter(stream);
		for (int i = 0; i < size; i++) {
			tokens.replace(0, 2, "x" + i);
			tokens.insertBefore(i, "y" + i);
		}
		return tokens;
	}
}


snuyanzin · 2025-09-25T13:16:23Z

@parrt, @KvanTTT sorry for the ping, could you please take a look here?

Commit 9d82adc: Make TokenStreamRewriter#reduceToSingleOperationPerIndex faster for large amount of rewrite operations

Signed-off-by: Sergey Nuyanzin <[email protected]>

snuyanzin changed the title ~~Make TokenStreamRewriter#reduceToSingleOperationPerIndex faster for large amount of rewrite operations~~ JAVA: Make TokenStreamRewriter#reduceToSingleOperationPerIndex faster for large amount of rewrite operations Sep 14, 2025
