Searchable Attachments Custom Field

Name JIRA Searchable Attachments Custom Field
Version 0.2
Product Versions JIRA 3.5+
Author(s) Philip Herbst
Homepage Searchable Attachments Custom Field
Price FREE!
License  
JavaDocs  
IssueTracking http://developer.atlassian.com/jira/browse/ATT
Download JAR SearchableAttachmentCFType-0.2.jar
Download Source http://svn.atlassian.com/svn/public/contrib/jira/jira-searchableattachment-plugin/trunk/

Description/Features

A custom field that indexes the content and file names of attachments and makes them searchable. It also offers a column view for the Issue Navigator with links to the attachments (opened in a new browser window).

You must reindex before you can search on the field. Download the plugin dependencies first, or indexing will break.


Indexing large PDF documents can be very slow. pdfbox-0.7.2 seems to be significantly faster, so please use that version of pdfbox.

Installation

  1. Copy the following to WEB-INF/lib: SearchableAttachmentCFType-0.1.jar
    http://www.ibiblio.org/maven/pdfbox/jars/pdfbox-0.7.1.jar
    http://www.ibiblio.org/maven/poi/jars/poi-2.0-final-20040126.jar
    http://repository.atlassian.com/tm-extractors/jars/tm-extractors-0.4.jar
  2. Restart Server
  3. Add Customfield with context of your choice (global if content should be indexed for all projects)
  4. Reindex
  5. optional: add field to your Issue Navigator

Dependencies

  1. pdfbox-0.7.2.jar
  2. poi-2.0-final-20040126.jar
  3. tm-extractors-0.4.jar

Usage

Currently supported file formats

  1. PDF
  2. MS Excel,Word,Powerpoint
  3. Plain text

Searching

The field reuses the built-in free-text searcher, so you can use the query syntax documented at http://www.atlassian.com/software/jira/docs/latest/querysyntax.html

Examples

Todo

  1. Include the dependencies in the plugin; for now you must download them and put them into the WEB-INF/lib folder
  2. Attachment comparator (ignore case?)
  3. How to sort?
  4. Impact on reindexing speed?

Version History

Version Author Notes
0.2 Philip Herbst Fixed ATT-1
0.1 Philip Herbst Initial release

Screenshots

Labels

plugin
  1. Mar 09, 2006

    Mark Chaimungkalanont says:

    That looks awesome Philip! Will definitely give it a try. For those looking for...
    1. Mar 09, 2006

      Philip Herbst says:

      Basically I'm reusing the code of the Confluence Attachment Content Extractor plugins. Since these plugins aren't compatible with JIRA, I put the code into some similar classes. Perhaps I could contribute the source code to the svn repository? (I'm using Eclipse+Maven)

      Cheers,

      Philip

      1. Mar 09, 2006

        Jonathan Nolen says:

        That would be great. I'll email you about setting up an account.

  2. Mar 09, 2006

    Jonathan Nolen says:

    Two things:

    1. Can I move this into the JIRA Extensions Plugin Library?
    2. You should enter this into the Codegeist Plugin competition.

    Cheers,
    Jonathan

  3. Jun 15, 2006

    Randall DeFauw says:

    Hi,

    I get an OutOfMemory error when I use this plug-in; it occurs while rebuilding the index. Without this plug-in, the index builds successfully.

    I have already allocated 1.5GB of RAM to JIRA; I don't think I can give it any more under 32-bit Windows.

    Is it possible to look at the memory usage of this plug-in? Perhaps do some of the operations in a separate process?

    Thanks,
    Randy

    1. Jun 18, 2006

      Dushan Hanuska says:

      Hi Randy,

      This is a third-party plugin and as such not supported by Atlassian. Hopefully, you'll get a response from the author.

      My guess in your case would be that the indexing may be running into memory problems if indexing large documents. Try to limit the size of the attached documents and see if that helps.
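
The size limit suggested above is normally a JIRA attachment setting. As an illustration only, the same idea could also be applied on the plugin side by skipping text extraction for oversized files. This is a minimal sketch under that assumption; `AttachmentSizeGuard` and its methods are hypothetical names and are not part of the plugin or of JIRA.

```java
import java.io.File;

// Hypothetical helper, not part of the plugin: decides whether an attachment
// is small enough to be handed to a text extractor during indexing.
final class AttachmentSizeGuard {
    private final long maxBytes;

    AttachmentSizeGuard(long maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        this.maxBytes = maxBytes;
    }

    /** True if a file of the given size should be passed to extraction. */
    boolean shouldExtract(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= maxBytes;
    }

    /** Convenience overload for files on disk; anything that is not a regular file is skipped. */
    boolean shouldExtract(File file) {
        return file.isFile() && shouldExtract(file.length());
    }
}
```

With a guard like this, files above the threshold would still be attached to the issue; only their content would be left out of the index.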

      Regards,

      Dushan 

      1. Jun 22, 2006

        Randall DeFauw says:

        Hi Dushan,

         That's more or less what I figured.  I have an alternative in mind for my own plug-in to address this issue.  Is Atlassian planning on adding this feature at some point?  If so, I'll just wait for the upgrade.

         Thanks!

        Randy

        1. Jun 22, 2006

          Dushan Hanuska says:

          Hi Randy, 

          I am not sure what feature you are referring to. If you want to limit the size of the attachments, please refer to the Configuring File Attachments page.

          I hope this helps,

          Dushan 

  4. Jul 17, 2006

    Tony Randall says:

    Does this plug-in work with Jira 3.6.2?

    I have downloaded the required files into WEB-INF/lib and generated my WAR file, but I can't find the new field type in the list of Custom Field types.

    Am I doing something wrong?

    1. Jul 18, 2006

      Philip Herbst says:

      Hi,
      works for me with 3.6, so it should work for 3.6.2, as there are no major API changes in point releases. Did you add the field in the administration section (see http://confluence.atlassian.com/download/attachments/164784/addfield.PNG)?

      1. Jul 18, 2006

        Tony Randall says:

        Hi Philip,

        DOH ! I am fairly new to Jira and Tomcat and, after I deleted the jira directory from the Tomcat webapps directory and restarted Tomcat... IT WORKS !

  5. Aug 08, 2006

    Neal Applebaum says:

    I tried this on my live installation but after re-indexing, (and restarting and re-indexing), it found no issues. So I had to delete the custom field and re-index to get them back. Maybe it's a memory thing?

    1. Aug 08, 2006

      Philip Herbst says:

      Hi Neal,

      Maybe that could be the case. Some questions:

      1. Can you have a look at the logs?
      2. What's your configured maximum attachment size? I saw some problems with very large attachments, perhaps a Lucene bug, and JIRA uses a quite old version (1.4.3) of Lucene... But that should not affect fields other than the attachment field.
      3. Could you try to add the field (perhaps not as a global field, e.g. just for one dummy project) without reindexing? Then add an attachment to a single issue (only that issue will be indexed) and try to search for its content, so you can see whether the field works. If it doesn't work, just remove the field and trigger a reindex of the issue (a simple issue update).

      Besides, I will have a look at the memory issues.

      Cheers,

      Phil 

  6. Aug 08, 2006

    Neal Applebaum says:

    Philip, I added the custom field to one project, and when I attached a file to an issue (upon creation), I got an error:

    Could not execute action [CreateIssueDetails]:org/pdfbox/exceptions/InvalidPasswordException

    java.lang.NoClassDefFoundError: org/pdfbox/exceptions/InvalidPasswordException
     at de.phil.jira.plugin.extractors.AttachmentContentHandler.initExtractors(AttachmentContentHandler.java:37)
     at de.phil.jira.plugin.extractors.AttachmentContentHandler.<init>(AttachmentContentHandler.java:30)
     at de.phil.jira.plugin.AttachmentSearcher.index(AttachmentSearcher.java:71)
     .... it went on further. I can attach the whole thing if you like. My max attachment size is 10MB, this file was a Word export of an issue (42 KB).

    Thanks, 

    Neal

    1. Aug 08, 2006

      Philip Herbst says:

      This error indicates that the PDFBox library is missing. You need the libraries mentioned at the top of the page in your WEB-INF/lib folder. Have you downloaded them and put them there?
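
One way to confirm whether a library is visible to the web application is to try loading one of its classes. The sketch below is only a diagnostic aid; `ClasspathCheck` is a hypothetical name, and the class name used in `main` is taken from the stack trace above. To be meaningful the check has to run inside the web application (for example from a JSP), so that the web application's class loader is used.

```java
// Diagnostic sketch: reports whether a class can be loaded by a given class loader.
final class ClasspathCheck {
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            // 'false' = do not run static initializers, only locate and load the class.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so broken dependencies land here too.
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        String pdfBoxClass = "org.pdfbox.exceptions.InvalidPasswordException";
        System.out.println(pdfBoxClass + " loadable: " + isLoadable(pdfBoxClass, loader));
    }
}
```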

  7. Aug 09, 2006

    Neal Applebaum says:

    Philip - I did have all jar files in my WEB-INF folder. But don't worry about it .. it's not a necessity for us, just trying it out. Maybe it's running under Weblogic that makes it not work. That app server is a general pain in the app.

    1. Feb 06, 2008

      Bettina Zucker says:

      Hello,

      some more libraries (2 BouncyCastle libraries) are needed, in case pdf attachments contain passwords, even in case the passwords are not needed to open the document but only protect some minor functions (e.g. adding comments, selecting text, printing, and so on).

      For an equivalent issue about indexing attachments in confluence see:

      http://jira.atlassian.com/browse/CONF-8580

      There the 2 BouncyCastle libraries are attached inside a comment.

      After installing the plugin without these two libraries and reindexing, all issues containing such a password-protected PDF file silently disappear from all navigator views.

      The issues are still there, if you know the issue key you can find them. But you cannot search them anymore. Searching just produces an error in the logfile and does not return any issue!
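
If the cause is that a failure while extracting one attachment aborts indexing of the whole issue (an assumption, not confirmed on this page), a defensive pattern is to isolate each extraction so that one bad file is skipped instead. This is a minimal sketch of that pattern; `SafeExtraction` and its `Extractor` interface are hypothetical stand-ins and not the plugin's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the plugin's actual code.
final class SafeExtraction {
    /** Stand-in for a per-attachment text extractor (assumed interface). */
    interface Extractor {
        String extract(String attachmentName) throws Exception;
    }

    /** Extracts text from every attachment, skipping the ones that fail. */
    static List<String> extractAll(List<String> attachmentNames, Extractor extractor) {
        List<String> texts = new ArrayList<String>();
        for (String name : attachmentNames) {
            try {
                texts.add(extractor.extract(name));
            } catch (Exception | LinkageError e) {
                // A missing library surfaces as NoClassDefFoundError (a LinkageError),
                // so catching Exception alone would not be enough here.
                System.err.println("Skipping attachment " + name + ": " + e);
            }
        }
        return texts;
    }
}
```

With this approach the issue would stay in the index; only the content of the failing attachment would not be searchable.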

      Cheers

      Bettina

  8. Aug 24, 2006

    Jakob Gormsen says:

    Does the plugin filter result sets, so that matched attachments aren't shown for people who wouldn't have permission to open the attachment? (I'm worried about potential information leaks.)

  9. Aug 25, 2006

    Ben Jones says:

    Hi Phillip,

    Pardon my ignorance, as I am new at writing plugins, but where in your code do you listen for the attachment event in JIRA? I'm trying to write a plugin which listens for when an attachment is made on an issue and then executes some code.

    Regards,

    Ben

  10. Oct 11, 2006

    Gail Coulthard says:

    Hi Philip.

    I'm new to JIRA... I have successfully installed this plugin (re-index was successful) and I see the Searchable Attachment Custom Field in my Custom Fields, but I'm stuck from there. I cannot see a new filter for searching attachments in my Issue Navigator.

    Any assistance is appreciated

    Thanks 

    Gail

    1. Oct 11, 2006

      Gail Coulthard says:

      Okay - I've played around and now my search for attachments is working. Is it possible to have the content of the document searched?

      1. Oct 11, 2006

        Philip Herbst says:

        Yes, that's possible. Supported content formats are PDF, MS Excel, Word, PowerPoint and plain text.

  11. Mar 26, 2007

    Keertikar Pandey says:

    Hi phil,

    I've deployed this plugin and it looks great.

    The only thing is that since deploying this plugin we have been facing performance issues while indexing in JIRA. I'm not sure the plugin is the cause, but I suspected it when I saw Randall's comment here.

    I've looked at the code and removed some SOPs (log.debug) that it outputs at indexing time, and also added code to close the input stream explicitly inside AbstractCommentHandler.java.

    I'm not sure whether this solution will work. Any suggestions for performance?

    1. Mar 26, 2007

      Philip Herbst says:

      Hi,

      If you are using JIRA 3.7, there's new code checked in which reuses Atlassian code from Confluence (from atlassian-bonnie, which is included in JIRA 3.7 and above). This should be more stable. For older versions, the source code is tagged as 0.1. In version 0.1 there's a bug in the extractText method of AttachmentContentHandler: a finally block that closes the FileInputStream is missing (which might explain the OutOfMemoryException). This bug should not be present in the newer trunk version:
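
The kind of fix described here, closing the stream in a finally block, can be sketched as follows. This is an illustration of the pattern only and not the plugin's actual extractText method; `StreamClosingSketch` and `StreamExtractor` are hypothetical names.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustration of the try/finally pattern, not the plugin's actual code.
final class StreamClosingSketch {
    /** Stand-in for a content extractor that reads from a stream (assumed interface). */
    interface StreamExtractor {
        String extract(InputStream in) throws IOException;
    }

    static String extractText(File file, StreamExtractor extractor) throws IOException {
        InputStream in = new FileInputStream(file);
        try {
            return extractor.extract(in);
        } finally {
            // Runs whether extraction returns normally or throws, so the file
            // handle is released even for attachments that fail to parse.
            in.close();
        }
    }
}
```

On Java 7 and later the same effect is usually written with try-with-resources; the explicit finally block is shown because it matches the fix described in the comment.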

      http://svn.atlassian.com/svn/public/contrib/jira/jira-searchableattachment-plugin/trunk/

      If you have a lot of PDF attachments, it might be a good choice to use pdfbox-0.7.3 (you also need FontBox-0.x in your classpath). I made some tests earlier and it seems to be slightly faster at extracting content.

      I'll build and upload the fixed version as soon as possible. But feel free to fix it yourself if you like.

      Cheers

      Phil
       

      1. Mar 26, 2007

        Keertikar Pandey says:

        Yes Phil,

        I fixed the same bug in my code (as I mentioned in my previous comment). After that, performance looks good judging by initial user response. Also, we don't have PDFs, so we currently don't need to apply the PDFBox solution.

        Thanks. 

        1. Mar 26, 2007

          Philip Herbst says:

          Hi,

          I've attached a new version of the plugin which fixes the above mentioned bugs. Perhaps you could try the new version and tell me if it solves the performance issues that you have. SearchableAttachmentCFType-0.1.1.jar

          Cheers,

          phil

          1. Jul 30, 2007

            Vincent Eggen says:

            Please note that FontBox-0.1.0-dev.jar is also required in WEB-INF/lib if you use PDFBox-0.7.3.jar.

            1. Jan 31, 2008

              Bettina Zucker says:

              Hello,

              I discovered that some more libraries (2 BouncyCastle libraries) are needed, in case pdf attachments contain passwords, even in case the passwords are not needed to open the document but only protect some minor functions (e.g. adding comments, selecting text, printing, and so on).

              For an equivalent issue about indexing attachments in confluence see:

              http://jira.atlassian.com/browse/CONF-8580

              There the 2 BouncyCastle libraries are attached inside a comment.

              The problem is, after installing the plugin without these two libraries an issue containing such a password-protected PDF file silently disappeared from all navigator views... very freaking! I just discovered by accident!

              Cheers

              Bettina

  12. Oct 21, 2007

    Matthew Janulewicz says:

    I just wrote up a bug for this, but I thought I'd post it here. Thanks!

    We just upgraded to Jira 3.11 (from 3.9) and the indexer is getting zillions of the following error. Perhaps Atlassian re-arranged some methods/libraries/whatever and broke this plugin?:

    2007-10-21 02:02:57,844 indexerPool-1-thread-8 ERROR [jira.issue.index.MultiThreadedIssueIndexer] Exception indexing issue SD-2560
    java.lang.NoSuchMethodError: org.apache.lucene.document.Field.Text(Ljava/lang/String;Ljava/lang/String;)Lorg/apache/lucene/document/Field;
    at de.phil.jira.plugin.AttachmentSearcher.index(AttachmentSearcher.java:79)
    at com.atlassian.jira.issue.index.indexers.impl.DefaultCustomFieldIndexer.addIndex(DefaultCustomFieldIndexer.java:54)
    at com.atlassian.jira.issue.index.IssueDocument.getDocument(IssueDocument.java:34)
    at com.atlassian.jira.issue.index.IssueDocumentBuilderImpl.get(IssueDocumentBuilderImpl.java:14)
    at com.atlassian.jira.issue.index.SingleThreadedIssueIndexer$IssueAndCommentCreator.handleIssueIndexing(SingleThreadedIssueIndexer.java:404)
    at com.atlassian.jira.issue.index.MultiThreadedIssueIndexer$IssueIndexerRunnable.run(MultiThreadedIssueIndexer.java:98)
    at com.atlassian.jira.util.concurrent.BoundedExecutor$1.run(BoundedExecutor.java:39)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
    at java.lang.Thread.run(Unknown Source)

    1. Oct 21, 2007

      Philip Herbst says:

      Hi Matthew,

      Atlassian upgraded the Lucene dependencies, and formerly deprecated methods were removed. Luckily I made the necessary
      changes in SVN some time ago. I have compiled the plugin against 3.11 and attached it as SearchableAttachmentCFType-0.2.jar.
      Hopefully this solves your problem.

      Cheers,

      Philip

  13. Feb 25, 2008

    Tal Abramson says:

    Hi
    i tried installing this nice custom field on a 3.10.2 installation
    i managed to create the custom field
    but after re indexing , i simply cant see the field in any of the screens i placed it in
    i tried all 3 versions of the jar, same results
    The logs don't show anything
    any ideas what could go wrong here?

    Tal

    1. Mar 23, 2008

      Tal Abramson says:

      So, I figured that this does not act like a standard custom field.
      So there's no way to create a custom field for attachments which will show those attachments separately with the correct name?

  14. Apr 21, 2008

    Matthew Janulewicz says:

    Has anyone else run this plugin with GreenHopper (http://www.greenpeppersoftware.com/en/products/GreenHopper/) or other plugins that have a custom field type?

    I can't seem to run both these plugins at once. I suspect some sort of clash between what each of these plugins expects to be in the index. When I have both of these plugins enabled a full reindex yields tons of the errors shown below. When just one of the plugins is enabled I have no indexing problems whatsoever.

    2008-04-21 11:06:00,136 http-80-Processor25 INFO [jira.issue.index.DefaultIndexManager] Reindexing all issues
    [WARNING] Unknown Ptg 3c (60)
    [WARNING] Unknown Ptg 3c (60)
    [WARNING] Unknown Ptg 3c (60)
    [WARNING] Unknown Ptg 3c (60)
    [WARNING] Unknown Ptg 3c (60)
    [WARNING] Unknown Ptg 3c (60)
    [WARNING] Unknown Ptg 3c (60)
    [WARNING] Unknown Ptg 3c (60)
    [WARNING] Unknown Ptg 3c (60)
    [WARNING] Unknown Ptg 3c (60)
    java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedConstructorAccessor92.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:257)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:221)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:139)
    at com.atlassian.bonnie.search.extractor.MsExcelContentExtractor.extractText(MsExcelContentExtractor.java:90)
    at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:31)
    at de.phil.jira.plugin.extractors.AttachmentContentHandler.extractAttachment(AttachmentContentHandler.java:60)
    at de.phil.jira.plugin.AttachmentSearcher.index(AttachmentSearcher.java:74)
    at com.atlassian.jira.issue.index.indexers.impl.DefaultCustomFieldIndexer.addIndex(DefaultCustomFieldIndexer.java:54)
    at com.atlassian.jira.issue.index.IssueDocument.getDocument(IssueDocument.java:34)
    at com.atlassian.jira.issue.index.IssueDocumentBuilderImpl.get(IssueDocumentBuilderImpl.java:14)
    at com.atlassian.jira.issue.index.SingleThreadedIssueIndexer$IssueAndCommentCreator.handleIssueIndexing(SingleThreadedIssueIndexer.java:404)
    at com.atlassian.jira.issue.index.MultiThreadedIssueIndexer$IssueIndexerRunnable.run(MultiThreadedIssueIndexer.java:98)
    at com.atlassian.jira.util.concurrent.BoundedExecutor$1.run(BoundedExecutor.java:39)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
    at java.lang.Thread.run(Unknown Source)
    Caused by: java.lang.ArrayIndexOutOfBoundsException: 11
    at org.apache.poi.util.LittleEndian.getNumber(LittleEndian.java:529)
    at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:177)
    at org.apache.poi.hssf.record.BOFRecord.fillFields(BOFRecord.java:170)
    at org.apache.poi.hssf.record.Record.fillFields(Record.java:127)
    at org.apache.poi.hssf.record.Record.<init>(Record.java:92)
    at org.apache.poi.hssf.record.BOFRecord.<init>(BOFRecord.java:135)
    ... 19 more
    2008-04-21 11:42:15,820 indexerPool-1-thread-3 ERROR [bonnie.search.extractor.BaseAttachmentContentExtractor] Error indexing attachment (EIDImport_Processed.xls): Unable to construct record instance, the following exception occured: null

  15. May 23, 2008

    Gert-Jan Bartelds says:

    Hello all,

    I am running on JIRA 3.10.2 Enterprise (fresh install, not an upgraded version) with Search