Topic: Network cluster setup questions

jrp

  • Jr. Member
  • Posts: 66
Network cluster setup questions
« on: October 18, 2018, 02:00:04 PM »
I have 2 (moderately) powerful Windows workstations and 2 headless Linux servers, and I would like to set up a network cluster. Some details are not clear in the documentation, though. Sorry for the long list of questions:

1)Can I run the 2 workstations as both processing nodes and client nodes? To do this, do I need to run PhotoScan twice on each workstation (once from the command line with --node, and once from the GUI), or is there a way to configure PhotoScan to act as both node and client?

2)Computers on our site use Windows domain logins. I notice many settings (including turning on GPU processing!) are stored per user. Is there a simple way to configure things so that users need minimal training to add network processing on the workstations?

3)If running it as above, do I need 4 licences, or will the 2 licences I have be good enough?

4)I may consider buying more licences and running the 2 headless Linux boxes also as nodes (they are dual Xeon E5-2640 v3 with no graphics cards – so not ideal, but not slow either) – is running the server and a node on the same machine a problem?

5)Are there any reliability issues with running a mixed Linux/Windows environment (other than the simple-to-fix --root file path stuff)?

6)We have a staff member with a laptop with PhotoScan on it. Can they easily connect to the network and run as a client, then easily switch to standalone for field use?

7)Also, how much network traffic is there with the network processing? Would the system benefit from 10Gbit ethernet, or is gigabit good enough? How much does latency on the network file store affect performance? Do the nodes cache data locally? Would they benefit from SSDs?

SB

  • Newbie
  • Posts: 39
Re: Network cluster setup questions
« Reply #1 on: October 25, 2018, 08:11:05 PM »
Quote
1)Can I run the 2 workstations as both processing nodes and client nodes? To do this, do I need to run PhotoScan twice on each workstation (once from the command line with --node, and once from the GUI), or is there a way to configure PhotoScan to act as both node and client?

Yes, but I can't imagine why you would.  Unless you want to have the ability to use the workstation as a node during off hours.  Depending on what task the node is doing, it can be difficult to check email or watch videos while Photoscan is working. 

Quote
2)Computers on our site use Windows domain logins. I notice many settings (including turning on GPU processing!) are stored per user. Is there a simple way to configure things so that users need minimal training to add network processing on the workstations?

Create a windows shortcut with command line parameters such as
"C:\Program Files\Agisoft\PhotoScan Pro\photoscan.exe" --node --dispatch jobmanager.com  --root \\dataserver.com\SCAN_PROJECTS --cpu_enable 1 --gpu_mask 3

Quote
3)If running it as above, do I need 4 licences, or will the 2 licences I have be good enough?
two for the workstations and one for the job manager, so 3 licenses

Quote
4)I may consider buying more licences and running the 2 headless Linux boxes also as nodes (they are dual Xeon E5-2640 v3 with no graphics cards – so not ideal, but not slow either) – is running the server and a node on the same machine a problem?

Probably not a problem (the job manager hardly uses any resources) but the node process may consume so many resources that the job manager cannot operate normally.  It all depends on what the node process is doing at the time.

Slow is relative.  An older workstation with a decent GPU will process faster than a CPU-only machine in some tasks.  The Linux servers will work fine for photo alignment and mesh creation.

Quote
5)Are there any reliability issues with running a mixed Linux/Windows environment (other than the simple-to-fix --root file path stuff)?

No.  My experience is that the Linux cluster nodes are more reliable.  I've had Windows perform an automatic reboot while Photoscan was still processing.

Quote
6)We have a staff member with a laptop with PhotoScan on it. Can they easily connect to the network and run as a client, then easily switch to standalone for field use?

If you have a static (node-locked) license there is nothing to do.  I think a roaming license would require you to do a license checkout before leaving the network.


Quote
7)Also, how much network traffic is there with the network processing? Would the system benefit from 10Gbit ethernet, or is gigabit good enough? How much does latency on the network file store affect performance? Do the nodes cache data locally? Would they benefit from SSDs?

Depends.  I usually have about 800GB of images on a network share.  For photo alignment, all 800GB of these images are transferred over the network to the different nodes, and maybe again during other tasks like texturing.

The nodes store data on a network share that they all access and write to.  If the project directory is 50GB (without the photos) then each node will be reading and writing a good bit from the project directory.  Latency has a huge effect during image alignment since all of the images are read fully, so having a copy of all of the images on a local drive in each node will really speed up that part of the alignment.  But, since alignment might take days or a week, this speed increase may not even make a difference.  If it reads the points from the photos in 12 hours instead of 24 and then spends the next 5 days doing the alignment, does it really matter that having the photos stored locally saved 12 hours?

I think Gigabit is enough but 10G would make some difference. 
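
To put rough numbers on the gigabit vs 10G question, here is a back-of-the-envelope sketch in Python. It only computes the ideal line-rate time for moving the 800GB of images once; the `efficiency` parameter and the function name are my own assumptions, and real throughput will be lower because of protocol overhead, disk speed, latency, and several nodes sharing the file server's link.

```python
def transfer_hours(data_gb: float, link_gbit_per_s: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_gb (decimal gigabytes) over one network link.

    efficiency is a hypothetical fudge factor (1.0 = ideal line rate).
    """
    bits = data_gb * 8e9  # gigabytes -> bits
    seconds = bits / (link_gbit_per_s * 1e9 * efficiency)
    return seconds / 3600


# One full pass over 800GB of images, at ideal line rate.
for speed in (1, 10):
    print(f"{speed:>2} Gbit/s: {transfer_hours(800, speed):.1f} h")
```

At ideal rates that is about 1.8 hours on gigabit versus roughly 11 minutes on 10G per full pass over the images, which is small next to an alignment that runs for days.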

Of course, if you really want it to be as fast as possible using what you have described then you would want a 10G network and copies of all the photos on an SSD in every workstation.  Also, the server with the network share should use an SSD to hold the project data.

jrp

Re: Network cluster setup questions
« Reply #2 on: November 07, 2018, 10:40:11 PM »
Quote
1)Can I run the 2 workstations as both processing nodes and client nodes? To do this, do I need to run PhotoScan twice on each workstation (once from the command line with --node, and once from the GUI), or is there a way to configure PhotoScan to act as both node and client?

Yes, but I can't imagine why you would.  Unless you want to have the ability to use the workstation as a node during off hours.  Depending on what task the node is doing, it can be difficult to check email or watch videos while Photoscan is working. 

Thank you very much, SB, for your detailed response. It helped a lot.

It is a university lab. We will occasionally have big jobs that need lots of resources, and occasionally have people working separately on student projects. This way we can disconnect the machines as nodes, send data to be processed, then reconnect them both and let it run. The aim is adaptability.


Quote
3)If running it as above, do I need 4 licences, or will the 2 licences I have be good enough?
two for the workstations and one for the job manager, so 3 licenses

I understood (from the manual) that the job manager doesn't need a licence. Do you know differently?



Quote
6)We have a staff member with a laptop with PhotoScan on it. Can they easily connect to the network and run as a client, then easily switch to standalone for field use?

If you have a static (node-locked) license there is nothing to do.  I think a roaming license would require you to do a license checkout before leaving the network.

This is a question about usability: how the individual connects to and disconnects from the network processing infrastructure. Information on how easy the interface is to use is hard to come by.

SB

Re: Network cluster setup questions
« Reply #3 on: November 15, 2018, 07:41:29 PM »
If the job manager is on the same workstation as a client then it doesn't need a license, but I think it does if it is on another machine.


You can check out a floating license onto a laptop before leaving the network for up to 30 days:

To borrow a floating license (assuming there is one available):
1) Run PhotoScan on the machine and go to the Help -> Activate Product... menu.
2) Click the Borrow License button in the PhotoScan Activation dialog. Set the number of days you would like to borrow the license for and click OK. The number of days should not exceed 30.
3) Now the machine can be disconnected from the server network, with PhotoScan kept activated on it.

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • Posts: 14813
Re: Network cluster setup questions
« Reply #4 on: November 15, 2018, 08:52:13 PM »
Just a small comment: the server instance of PhotoScan used for network processing doesn't require license activation, so it can be run on any separate machine that is not involved in the processing.
Best regards,
Alexey Pasumansky,
Agisoft LLC

SB

Re: Network cluster setup questions
« Reply #5 on: November 19, 2018, 09:51:54 PM »
Thank you Alexey. 

SB

Alexey Pasumansky

Re: Network cluster setup questions
« Reply #6 on: November 20, 2018, 12:29:41 PM »
We have published the general steps for configuring network processing:
https://agisoft.freshdesk.com/support/solutions/articles/31000145918-how-to-configure-the-network-processing

Please feel free to comment on which information should also be included in the instructions, for example screenshots and commands for starting server/node instances on all supported OS.
Best regards,
Alexey Pasumansky,
Agisoft LLC

SB

Re: Network cluster setup questions
« Reply #7 on: November 27, 2018, 09:23:20 PM »
Hi Alexey

I think there should be more information regarding firewall configuration, ports, and command-line options.

You can run the rlm server and job manager on any server or workstation, but it needs to be accessible by all of your cluster nodes.  If the nodes can access a dedicated Windows or Linux server in your office or data center then I would use it.  Put the rlm server and the Photoscan job manager on the same machine since they use hardly any resources.  The only real requirement is that it is reliable and always up.  A Linux machine would be better to use than a Windows server that does automatic updates and reboots...

Once the rlm server is running you connect to it via web browser.

http://your.hostname.edu:5054/

The service that's running is a simple program that listens on port 5054 by default.  That means that any firewall on the server/workstation where rlm and the job manager are running needs to be configured to allow connections to the appropriate ports.  What you might want to do is allow connections only from the IP addresses of the cluster nodes and your own desktop.  If anyone can connect to those ports then anyone who knows about it can set up a bunch of computers with Photoscan and use your licenses!

Configure the firewall or turn it off for testing.  Start the rlm service and try to connect from some other computer using a web browser.  Configure the rlm server with your information.


I use Windows PCs for project creation and other work that requires the graphical interface (GUI), and both Windows PCs and Linux nodes for network processing.

There are several parameters that are very important to make this work:

Use --root to point every Photoscan instance to the same directory where the project files are located.

Use --dispatch to say where the job manager is.

Use --platform offscreen to tell Photoscan not to use Qt or X11 (headless mode).

A basic command that has everything you need is

./photoscan.sh --root /data/shared  --dispatch my.job-manager-hostname.edu --node --cpu_enable 1 --gpu_mask 15 --capability any --platform offscreen

If the cluster node does not have a GPU then you should leave out the --gpu_mask option.


In this example, /data/shared would be a Samba share or some other network share that is mounted on every node, even Windows PCs.  Of course, Windows PCs will not mount a share into a directory like /data/shared, so the --root will be different for them.

Here is a sample Windows shortcut that starts Photoscan as a node:
"C:\Program Files\Agisoft\PhotoScan Pro\photoscan.exe" --node --dispatch my.job-manager-hostname.edu --root \\nas22.mydomain.edu\DEPT\Data --cpu_enable 1 --gpu_mask 3
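
If you have more than a couple of nodes, it may help to generate these command lines instead of typing them by hand. The following is only a sketch in Python: the function name, the `EXECUTABLES` table and the `headless` switch are made up for illustration, and the only flags it emits are the ones shown in the commands above. As far as I know --gpu_mask is a bitmask with one bit per GPU (so 3 would enable the first two GPUs and 15 the first four), but treat that as an assumption and check the manual.

```python
from typing import List, Optional

# Executable locations copied from the examples in this thread;
# adjust them to wherever Photoscan is installed on your machines.
EXECUTABLES = {
    "windows": r"C:\Program Files\Agisoft\PhotoScan Pro\photoscan.exe",
    "linux": "./photoscan.sh",
}


def node_command(platform: str, dispatch_host: str, root: str,
                 gpu_mask: Optional[int] = None,
                 capability: Optional[str] = None,
                 headless: bool = False) -> List[str]:
    """Build the argument list that starts Photoscan as a processing node."""
    cmd = [EXECUTABLES[platform], "--node",
           "--dispatch", dispatch_host,
           "--root", root,
           "--cpu_enable", "1"]
    if gpu_mask is not None:
        # Leave --gpu_mask out entirely on machines without a GPU.
        cmd += ["--gpu_mask", str(gpu_mask)]
    if capability is not None:
        cmd += ["--capability", capability]
    if headless:
        # No Qt/X11, for servers without a display.
        cmd += ["--platform", "offscreen"]
    return cmd


# A headless Linux node without a GPU, and a Windows workstation with two GPUs.
print(node_command("linux", "my.job-manager-hostname.edu", "/data/shared",
                   capability="any", headless=True))
print(node_command("windows", "my.job-manager-hostname.edu",
                   r"\\nas22.mydomain.edu\DEPT\Data", gpu_mask=3))
```

The returned list can be passed straight to subprocess.run(), which avoids quoting problems with the space in "Program Files".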


You may have trouble connecting to the rlm license server from some of the computers or nodes.  That is due to the way the client searches for the license server by default.


That is why Agisoft says this:
8) As for the client side for the floating licenses, on Windows the licenses are usually broadcast automatically. Alternatively, you can put a single-line license file in the PhotoScan Pro installation folder.

      The file should have the .lic extension (server.lic, for example) and should contain the following line:
            HOST FLS_address any the_port_number

      FLS_address - could be a computer name or IP address,
      the_port_number - 5053 by default.
      The words "HOST" and "any" shouldn't be changed.


For example:
          HOST 127.0.0.1 any 5053
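
If you need to drop that license file onto several clients, a small script can write it. This is a minimal sketch in Python assuming the single-line format quoted above; the function names and the default file name server.lic are just for illustration, and you have to pass in the real installation folder yourself.

```python
from pathlib import Path


def license_line(server_address: str, port: int = 5053) -> str:
    """Return the single HOST line; "HOST" and "any" are fixed words."""
    return f"HOST {server_address} any {port}"


def write_client_license(install_dir: str, server_address: str,
                         port: int = 5053, filename: str = "server.lic") -> Path:
    """Write the one-line .lic file into the given installation folder."""
    path = Path(install_dir) / filename
    path.write_text(license_line(server_address, port) + "\n")
    return path


# Same line as the example above.
print(license_line("127.0.0.1"))
```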


Also, there is no need to ask anyone to install Photoscan on the cluster nodes.  You can have many copies or only one in your own home directory and run Photoscan on any cluster node that mounts your home directory.  It would be a very unusual HPC cluster if it did not have access to your home directory or files while executing your job...


The key to running Photoscan on cluster nodes is using the parameters with correct settings and having the files available in a predictable location on every node, and every node must be able to connect to the Photoscan job manager.


To start Photoscan in manager mode, use the --server option.

Windows Shortcut
"C:\Program Files\Agisoft\PhotoScan Pro\photoscan.exe" --server

or
Linux command
 ./photoscan.sh --server


The parameters --control and --dispatch let you change the ports used by the job manager.  I'm using 5840 and 5841, which I think may be the defaults.  Anyway, the firewall has to allow connections to these ports from every node too.

That means every cluster node must be able to connect to 5840, 5841, 5053, and 5054 on your job manager / rlm server in order for the node to get a license and tasks.
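
A quick way to test the firewall from a node is to try a plain TCP connection to each of those ports. Below is a rough Python sketch using only the standard library; the function names are mine, and the port list is simply the four ports mentioned above, so change it if you moved anything with --control or --dispatch. It only tells you that something accepts connections on the port, not that licensing or job dispatch actually works.

```python
import socket

# Ports from this post: rlm license (5053), rlm web page (5054),
# and the two job manager ports (5840, 5841).
DEFAULT_PORTS = (5053, 5054, 5840, 5841)


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False


def check_ports(host: str, ports=DEFAULT_PORTS) -> dict:
    """Map each port to True/False depending on whether it is reachable."""
    return {port: port_open(host, port) for port in ports}


# Example, run on each node (hostname is the placeholder used above):
# for port, ok in check_ports("my.job-manager-hostname.edu").items():
#     print(port, "open" if ok else "BLOCKED")
```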