Documents and memory

Hello,

How does the Bonita Engine, in 7.10.x, manage memory when it handles documents?

1/ Widget File Upload => Upload

During this operation, is the complete document loaded in memory to create a temporary file, or only in packets (10,000 characters, for example)?

2/ Submit the contract

Currently, during the file upload, an ID (the temporary file name) is returned. That means no memory is used during the contract submission, correct?

3/ in the operation, upload the document

Then, in the operation (or in the initialization of the document when this is an instantiation), the complete document is loaded in memory to save it as a blob in the database, correct?

If the variable is a list of documents and the user provides 5 documents, then the 5 documents are loaded in memory at the same time, correct?

4/ in a connector

When a connector (the CMIS connector, for example) is used, the document is loaded in memory, correct? If this is a list of 5 documents, then the 5 documents are loaded in memory at the same time, correct?

5/ display in a form

When a File Viewer widget is used and a user clicks on a link, the REST API loads the document from the database to send it. To do that, the document is loaded in memory, correct?

If this is a list of 5 documents, there are 5 REST API calls, so the 5 documents may be loaded in memory at the same time, but I am not sure.

Thanks for confirming or explaining.

Hi Pierre-Yves,

During this operation, is the complete document loaded in memory to create a temporary file, or only in packets (10,000 characters, for example)?

It uses a multipart POST request that chunks the file.

Currently, during the file upload, an ID (the temporary file name) is returned. That means no memory is used during the contract submission, correct?

There is no client-side memory (in your browser) tied to the file. However, on the server side, the file will be stored in the database, which implies memory consumption.

Then, in the operation (or in the initialization of the document when this is an instantiation), the complete document is loaded in memory to save it as a blob in the database, correct? If the variable is a list of documents and the user provides 5 documents, then the 5 documents are loaded in memory at the same time, correct?

I don't know the engine's implementation details; maybe the files are loaded sequentially, maybe not.

When a connector (the CMIS connector, for example) is used, the document is loaded in memory, correct? If this is a list of 5 documents, then the 5 documents are loaded in memory at the same time, correct?

If you look at the implementation, each document content is loaded sequentially, so you don't have the 5 documents in memory at the same time.
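
To illustrate what "loaded sequentially" means for memory, here is a minimal Java sketch. It is not the CMIS connector's code: the class name, the loadContent function and the send callback are all hypothetical stand-ins. The point is only that each document's bytes are referenced inside one loop iteration, so the previous content can be garbage collected before the next one is loaded.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

class SequentialDocumentSketch {

    /**
     * Loads and sends documents one at a time.
     * loadContent and send are hypothetical stand-ins for "read the content
     * of a document" and "push it to the target system".
     */
    static int sendAll(List<String> documentIds,
                       Function<String, byte[]> loadContent,
                       Consumer<byte[]> send) {
        int sent = 0;
        for (String id : documentIds) {
            // Only this document's bytes are referenced during the iteration.
            byte[] content = loadContent.apply(id);
            send.accept(content);
            sent++;
            // 'content' goes out of scope here and becomes eligible for GC.
        }
        return sent;
    }

    public static void main(String[] args) {
        int sent = sendAll(List.of("doc-1", "doc-2"),
                id -> new byte[1024],          // fake 1 KB content per document
                content -> { /* pretend to upload */ });
        System.out.println("documents sent: " + sent);
    }
}
```

With this pattern the peak memory is roughly the size of the largest single document, not the sum of all of them. Each document is still fully in memory while it is being sent.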

When a File Viewer widget is used and a user clicks on a link, the REST API loads the document from the database to send it. To do that, the document is loaded in memory, correct? If this is a list of 5 documents, there are 5 REST API calls, so the 5 documents may be loaded in memory at the same time, but I am not sure.

The FileViewer uses the context API, so there are not 5 REST API calls. If preview is disabled, the content of a document is not downloaded until you click on its link.
When preview is enabled and those 5 documents are, let's say, PDFs or images, then yes, the formsDocumentImage servlet is called 5 times and the PDFs are all loaded in memory. That's why disabling the preview in those cases seems a good idea :)

HTH
Romain

Thank you, Romain,
My question 1 was about memory on the SERVER side.

My customer wants to manage long and large files, so we need to know which mechanisms do not consume memory. If the file upload does not consume memory (i.e. the file is never entirely in memory at one time, but is read from the HTTP request and saved in packets of XXX bytes), that means we can continue to use this mechanism, and then try to find a way to save the document in a GED (document management system) without using the Bonita Document.

Best,

My question 1 was about memory on the SERVER side.

Yes, as long as the request is chunked, the memory usage for a file transfer will be the same on the server side and the client side. It is just a file upload.
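
To make the chunked behaviour concrete on the server side, here is a minimal Java sketch of copying an upload stream to a temporary file through a fixed-size buffer. It is not Bonita's actual code: the class name, the method name and the 8 KB buffer size are assumptions chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class ChunkedUploadSketch {

    // Hypothetical buffer size; the value really used by the server is not stated in this thread.
    static final int BUFFER_SIZE = 8192;

    /** Copies the stream to a temporary file, holding at most BUFFER_SIZE bytes of it at a time. */
    static Path saveToTempFile(InputStream body) throws IOException {
        Path tempFile = Files.createTempFile("upload-", ".tmp");
        byte[] buffer = new byte[BUFFER_SIZE];
        try (OutputStream out = Files.newOutputStream(tempFile)) {
            int read;
            // Read one packet, write it to disk, then reuse the same buffer.
            while ((read = body.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return tempFile;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the HTTP request body of an upload.
        InputStream fakeBody = new ByteArrayInputStream(new byte[1_000_000]);
        Path saved = saveToTempFile(fakeBody);
        System.out.println(saved + " holds " + Files.size(saved) + " bytes");
        Files.delete(saved);
    }
}
```

In this pattern the memory used by the copy is bounded by the buffer size, whatever the size of the file. Whether a later step (saving the document to the database, passing it to a connector) keeps the same property is a separate question.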

My customer wants to manage long and large files

What size? Be careful, Bonita is not designed to handle very large files; it is not a GED. For anything bigger than 25 MB, I recommend using a GED to store the documents and using Bonita documents with a URL (redirecting to the GED). This way Bonita only keeps references to documents stored in the GED.

Hello Romain,

The point is that Bonita has to save the document before the GED does. How can we avoid that?

Imagine that you capture a list of documents (let's say 10 documents of 80 MB each) and want to save them in a GED. You have to capture the documents in a FileInput contract, save them as Bonita Documents, and then use a connector to send them to the GED. So it is a double penalty:

  • documents are loaded in memory at instantiation, when the case is created
  • documents are saved in the Bonita database
  • documents are loaded in memory again to pass them to the connector
  • and even if you delete the documents, they will stay in the Bonita database

What strategy can we propose here? I was thinking of a process connector (which can access the contract, and so the file uploaded to the temporary directory) that saves the documents itself. In the rest of the process, you just ignore the FileInput contract.

This strategy should also work on a task, shouldn't it? A task's connector can access the contract. So, do we perhaps still have the document saved in the contract table?
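
As a rough illustration of the transfer step in such a strategy, here is a Java sketch (Java 11+) that builds an HTTP request whose body is read from a file on disk when the request is sent, instead of first loading the file into a byte array. Everything specific in it is an assumption: the GED URL is hypothetical, the GED may not accept a plain PUT, and this thread does not confirm that a connector can obtain the path of the temporary file.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamToGedSketch {

    // Hypothetical endpoint: replace with whatever upload URL the GED really exposes.
    static final String GED_UPLOAD_URL = "https://ged.example.com/api/documents";

    /** Builds a PUT request whose body is taken from the file when the request is sent. */
    static HttpRequest buildUploadRequest(Path file) throws IOException {
        return HttpRequest.newBuilder(URI.create(GED_UPLOAD_URL))
                .header("Content-Type", "application/octet-stream")
                .PUT(HttpRequest.BodyPublishers.ofFile(file))
                .build();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the uploaded temporary file.
        Path file = Files.createTempFile("contract-file-", ".tmp");
        Files.write(file, new byte[1024]);
        HttpRequest request = buildUploadRequest(file);
        System.out.println(request.method() + " " + request.uri());
        // Sending is left out on purpose, since the endpoint above does not exist:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding());
        Files.delete(file);
    }
}
```

The same request-building code could live in a connector or in a REST API extension; which of the two can actually reach the temporary file is the open question of this thread.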

To avoid that, you will have to abandon the idea of using BPMN and connectors to upload the document to the GED.
The upload should be done in the form/page using the GED's REST/HTTP API (if any). Then a Bonita document can be used with a URL. But this requires that the GED offers a way to upload files through an HTTP/REST API; otherwise you may have to develop a custom REST API Extension for that purpose.

Hello, 

But if we don't want to upload directly to the GED, do you think the strategy I described should work?

There are many reasons not to upload directly to the GED.

1/ We don't want to open the GED to end users, especially when the user filling in the form is external to the company.

2/ We don't want to expose the GED, especially when there is no SSO in place (otherwise the user has to log in twice, once in Bonita and once in the GED, which is strange).

3/ We want to control the operation. If the GED fails, the process can move to a special path. If you do that on the front end (which can return a "status" from the GED), this is an issue from a security point of view (how can I trust a direct REST API call?).

For all these reasons, we generally prefer that all internal systems are managed by connectors, never by a direct REST API call from the forms. That is also the reason companies prefer to develop a REST API Extension on the Bonita server rather than calling the REST API directly from the form.

So, that said, do you think the strategy works with Bonita? I think so.