I downloaded a large set of molecules from the ZINC database ( http://zinc.docking.org/ ) in mol2 format ( http://tripos.com/index.php?family=modules,SimplePage,,,&page=sup_mol2&s=0 ). I would like to split this database into an arbitrary number N of smaller databases. What is the best way to script this in Python, bash or Perl? I read about Open Babel, but as far as I can tell it can only split the set into individual molecules.
Failing that, I can also convert the mol2 file to another, more convenient format.
Thanks
csplit can split the file into individual molecules (the '{*}' repeat count is a GNU extension):
csplit ~/Download/zinc.mol2 '/@<TRIPOS>MOLECULE/' '{*}'
On Linux, with gawk:
gawk -v RS="@<TRIPOS>MOLECULE" 'NF{ print RS $0 > ("zinc" ++n ".mol2") }' zinc.mol2
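Both commands above produce one file per molecule, while the question asks for N files. A minimal Python sketch of that case follows. It assumes every record begins with a line starting with @<TRIPOS>MOLECULE and that any lines before the first record belong with the first molecule. The names iter_molecules and split_mol2 and the part_1.mol2 style of output naming are made up for illustration. Molecules are dealt out round-robin, so the input is read only once and never held in memory as a whole.

```python
import sys

RECORD_START = "@<TRIPOS>MOLECULE"


def iter_molecules(lines):
    """Yield each molecule record as a list of lines.

    Lines that appear before the first record marker are kept
    with the first molecule. Input without any marker yields nothing.
    """
    record = []
    seen_start = False
    for line in lines:
        if line.startswith(RECORD_START):
            if seen_start:
                yield record
                record = []
            seen_start = True
        record.append(line)
    if seen_start:
        yield record


def split_mol2(path, n, prefix="part"):
    """Distribute the molecules in `path` round-robin over n files.

    Returns the list of output file names (hypothetical naming scheme:
    <prefix>_1.mol2 ... <prefix>_n.mol2).
    """
    names = [f"{prefix}_{i + 1}.mol2" for i in range(n)]
    outputs = [open(name, "w") for name in names]
    try:
        with open(path) as src:
            for index, record in enumerate(iter_molecules(src)):
                outputs[index % n].writelines(record)
    finally:
        for out in outputs:
            out.close()
    return names


if __name__ == "__main__":
    # Usage: python split_mol2.py zinc.mol2 10
    split_mol2(sys.argv[1], int(sys.argv[2]))
```

With round-robin assignment the output files differ in size by at most one molecule, but neighbouring molecules in the input end up in different files. If contiguous blocks matter, the molecules would have to be counted first.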
Source: https://habr.com/ru/post/1726877/