Elasticsearch ingest attachment node crashes #91964

@zoeliterally

Description

Elasticsearch Version

Version: 8.5.2, Build: deb/a846182fa16b4ebfcc89aa3c11a11fd5adf3de04/2022-11-17T18:56:17.538630285Z, JVM: 19.0.1

Installed Plugins

No response

Java Version

bundled

OS Version

Linux ldc-data04 5.15.0-1023-azure #29~20.04.1-Ubuntu SMP Wed Oct 26 19:18:25 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Problem Description

Hey! I wrote a small application for our company that indexes all documents on our file server using the ingest attachment processor. In 7.x (where it was a separate plugin) this worked fine; since the move to 8.x (built-in ingest attachment) the ingest node crashes every time.

The indexer application runs just fine and doesn't get any errors, and the cluster accepts every new document even after the ingest node has crashed, which makes it hard to find the document that kills the node.
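One possible way to isolate the offending file, sketched here under assumptions rather than taken from the indexer described above: run candidate files one at a time through the ingest simulate endpoint (`POST /_ingest/pipeline/<pipeline-id>/_simulate`), ideally against a throwaway node, and note which file stops the node from responding. The helper below only builds the request body. The field name `data` and both function names are hypothetical and would need to match the actual pipeline configuration.

```python
import base64
import json


def build_simulate_request(raw: bytes, field: str = "data") -> str:
    """Build a JSON body for POST /_ingest/pipeline/<pipeline-id>/_simulate.

    `field` is an assumed name for the field the attachment processor
    reads its base64 payload from; adjust it to the real pipeline.
    """
    encoded = base64.b64encode(raw).decode("ascii")
    return json.dumps({"docs": [{"_source": {field: encoded}}]})


def split_in_half(paths: list[str]) -> tuple[list[str], list[str]]:
    """Split a candidate list in two so a failing batch can be bisected."""
    mid = len(paths) // 2
    return paths[:mid], paths[mid:]
```

Sending one document per request keeps the mapping between a crash and a file unambiguous; bisecting batches with `split_in_half` is a faster alternative when the folder is large.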

The service keeps running, but no new documents can get processed.

Also, the log file explodes to over 30 MB, which is a lot for around 5000 successfully indexed documents. Toward the end it consists mostly of "java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "getenv.*")" exceptions, and the last stack trace is cut off.
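Because most of the file is repeated `stderr` stack traces, a small filter can help surface the few lines that come from other loggers. This is a minimal sketch that assumes the line layout seen in the excerpt under "Logs" (`[timestamp][LEVEL][logger] [node] message`); the function name `summarize` is illustrative only.

```python
import re
from collections import Counter

# Matches lines shaped like: [timestamp][LEVEL ][logger   ] [node] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<logger>[^\]]+?)\s*\]"
    r"\s*\[(?P<node>[^\]]+)\]\s?(?P<msg>.*)$"
)


def summarize(lines):
    """Count lines per (level, logger) and collect non-stderr messages."""
    counts = Counter()
    other_messages = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        counts[(match["level"], match["logger"])] += 1
        if match["logger"] != "stderr":
            other_messages.append(match["msg"])
    return counts, other_messages
```

Calling `summarize(open(path, encoding="utf-8", errors="replace"))` on the uncompressed log would give the per-logger line counts plus the messages that are not part of the `stderr` noise.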

Compressed Logfile

Steps to Reproduce

I haven't found the problematic documents yet. It always happens when I reindex the folder; after a while the node crashes.

Logs (if relevant)

[2022-11-24T09:25:23,276][INFO ][o.a.p.h.c.Chunk          ] [ldc-data04] Command of type 31 not processed!
[2022-11-24T09:25:23,290][WARN ][stderr                   ] [ldc-data04] The system environment variables are not available to Log4j due to security restrictions: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "getenv.*")
[2022-11-24T09:25:23,290][WARN ][stderr                   ] [ldc-data04] java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "getenv.*")
[2022-11-24T09:25:23,291][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:485)
[2022-11-24T09:25:23,291][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.security.AccessController.checkPermission(AccessController.java:1068)
[2022-11-24T09:25:23,291][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411)
[2022-11-24T09:25:23,291][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.lang.System.getenv(System.java:1198)
[2022-11-24T09:25:23,291][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.util.EnvironmentPropertySource.containsProperty(EnvironmentPropertySource.java:99)
[2022-11-24T09:25:23,291][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.util.PropertiesUtil$Environment.get(PropertiesUtil.java:513)
[2022-11-24T09:25:23,291][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.util.PropertiesUtil$Environment.access$200(PropertiesUtil.java:434)
[2022-11-24T09:25:23,291][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.util.PropertiesUtil.getStringProperty(PropertiesUtil.java:382)
[2022-11-24T09:25:23,291][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.util.PropertiesUtil.getBooleanProperty(PropertiesUtil.java:169)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.status.StatusLogger.isDebugPropertyEnabled(StatusLogger.java:143)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.status.StatusLogger.isEnabled(StatusLogger.java:426)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.status.StatusLogger.isEnabled(StatusLogger.java:354)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1914)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.spi.AbstractLogger.debug(AbstractLogger.java:463)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.appender.rolling.PatternProcessor.formatFileName(PatternProcessor.java:291)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.appender.rolling.PatternProcessor.formatFileName(PatternProcessor.java:278)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.appender.rolling.AbstractRolloverStrategy.getEligibleFiles(AbstractRolloverStrategy.java:94)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.appender.rolling.AbstractRolloverStrategy.getEligibleFiles(AbstractRolloverStrategy.java:87)
[2022-11-24T09:25:23,292][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.appender.rolling.DefaultRolloverStrategy.rollover(DefaultRolloverStrategy.java:524)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.appender.rolling.RollingFileManager.rollover(RollingFileManager.java:504)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.appender.rolling.RollingFileManager.rollover(RollingFileManager.java:394)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.appender.rolling.RollingFileManager.checkRollover(RollingFileManager.java:308)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.appender.RollingFileAppender.append(RollingFileAppender.java:300)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:161)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:134)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:125)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:89)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:675)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:633)
[2022-11-24T09:25:23,293][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:616)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:552)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:82)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.core.Logger.log(Logger.java:161)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2106)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.internal.DefaultLogBuilder.logMessage(DefaultLogBuilder.java:234)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.internal.DefaultLogBuilder.log(DefaultLogBuilder.java:162)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at org.apache.poi.hdgf.chunks.Chunk.processCommands(Chunk.java:184)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at org.apache.poi.hdgf.chunks.ChunkFactory.createChunk(ChunkFactory.java:207)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at org.apache.poi.hdgf.streams.ChunkStream.findChunks(ChunkStream.java:66)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:70)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:77)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:77)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:89)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:52)
[2022-11-24T09:25:23,294][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:232)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:175)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:180)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:71)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:108)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedFile(AbstractOOXMLExtractor.java:406)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedOLE(AbstractOOXMLExtractor.java:351)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedPart(AbstractOOXMLExtractor.java:267)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedParts(AbstractOOXMLExtractor.java:217)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:138)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:242)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:115)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:152)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:180)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.apache.tika.Tika.parseToString(Tika.java:525)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.elasticsearch.ingest.attachment.TikaImpl.lambda$parse$0(TikaImpl.java:97)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.elasticsearch.ingest.attachment.TikaImpl.parse(TikaImpl.java:96)
[2022-11-24T09:25:23,295][WARN ][stderr                   ] [ldc-data04] 	at org.elasticsearch.ingest.attachment.AttachmentProcessor.execute(AttachmentProcessor.java:116)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.elasticsearch.ingest.CompoundProcessor.innerExecute(CompoundProcessor.java:174)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:152)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.elasticsearch.ingest.Pipeline.execute(Pipeline.java:129)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.elasticsearch.ingest.IngestDocument.executePipeline(IngestDocument.java:831)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.elasticsearch.ingest.IngestService.innerExecute(IngestService.java:895)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.elasticsearch.ingest.IngestService.executePipelines(IngestService.java:745)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.elasticsearch.ingest.IngestService$1.doRun(IngestService.java:707)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:892)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.lang.Thread.run(Thread.java:1589)
[2022-11-24T09:25:23,296][WARN ][stderr                   ] [ldc-data04] The system environment variables are not available to Log4j due to security restrictions: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "getenv.*")
[2022-11-24T09:25:23,297][WARN ][stderr                   ] [ldc-data04] java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "getenv.*")
[2022-11-24T09:25:23,297][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:485)
[2022-11-24T09:25:23,297][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.security.AccessController.checkPermission(AccessController.java:1068)
[2022-11-24T09:25:23,297][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411)
[2022-11-24T09:25:23,297][WARN ][stderr                   ] [ldc-data04] 	at java.base/java.lang.System.getenv(System.java:1198)
[2022-11-24T09:25:23,297][WARN ][stderr                   ] [ldc-data04] 	at [email protected]/org.apache.logging.log4j.util.EnvironmentPropertySource.containsProperty(EnvironmentPropertySource.java:99)
