This paper alleges that iAF1260 only computes due to numerical error. Here, I will demonstrate that this is false by computing rational and integer solutions to the model.
from os.path import expanduser
from fractions import Fraction
from numpy import array, ceil, sign
import pandas
import cobra
from gurobipy import gurobi
import glpk
print "cglpk:", cobra.solvers.cglpk.__glpk_version__
print "glpk:", ".".join(str(i) for i in glpk.env.version)
print "gurobi:", ".".join(str(i) for i in gurobi.version())
print "cplex:", cobra.solvers.cplex_solver.Cplex().get_version()
cglpk: 4.55 glpk: 4.45 gurobi: 5.6.3 cplex: 12.4.0.0
This file encoding iAF1260 was downloaded from bigg
original = cobra.io.read_legacy_sbml(expanduser("~/Downloads/iAF1260.xml"))
original.id = original.description = "iAF1260"
First we solve the problem using each of the floating point solvers
solver_results = {solver: original.optimize(solver=solver).f for solver in cobra.solvers.solver_dict}
solver_results = pandas.Series(solver_results)
solver_results
cglpk 0.736701 cplex 0.736701 glpk 0.736701 gurobi 0.736701 dtype: float64
The MATLAB COBRA toolbox with GLPK also has an identical solution.
cobra.io.save_matlab_model(original, "iAF1260.mat")
%load_ext pymatbridge
Starting MATLAB on ZMQ socket ipc:///tmp/pymatbridge Send 'exit' command to kill the server .MATLAB started and connected!
%%matlab --silent -o mat_sol
addpath('~/cobratoolbox')
initCobraToolbox
load iAF1260.mat
result = optimizeCbModel(iAF1260)
mat_sol = result.f
solver_results["COBRA toolbox"] = mat_sol
mat_sol
0.7367009387595181
Now we try it again with E-solver
lp = cobra.solvers.glpk_solver.create_problem(original)
lp.write(cpxlp="iAF1260.lp")
!esolver -O iAF1260 iAF1260.lp
Using EGlib (SVN-version 2.6.18:412, built Mar 29 2012-09:24:26) Using QSopt_ex (SVN-version 2.5.10:518, built Mar 29 2012-09:29:26) Host: clostridium, Linux 3.2.0-69-generic x86_64 Current process id: 7764 Using EG-GMP mempool Reading problem from iAF1260.lp Cur rtime limit -1, trying to set to 2.14748e+09 New rtime limit 2147483647 (2.15e+09) Cur data limit -1,-1 (soft,hard) New data limit 4294967295,-1 (soft,hard) Cur address space limit -1,-1 (soft,hard) New address space limit 4294967295,-1 (soft,hard) New core dump space limit 0,-1 (soft,hard) Data Warning: Setting problem name to "unnamed". mpq_ILLlp_add_logicals ... Time for SOLVER_READ_MPQ: 0.02 seconds. ================================================================================ Trying double precision ================================================================================ starting dbl_ILLsimplex on scaled_lp... Problem has 1668 rows and 4050 cols and 10899 nonzeros starting primal phase I, nosolve 1 (0): primal infeas = 861351480.8485694 (100): primal infeas = 34111034.5800002 (200): primal infeas = 5287009.7090741 (300): primal infeas = 2592998.2718045 (400): primal infeas = 892639.6506070 (500): primal infeas = 451457.5295782 (600): primal infeas = 29546.6237273 starting primal phase II, nosolve 66 (700): primal objval = -0.0000020 (800): primal objval = -0.0000299 (900): primal objval = -0.0008766 (1000): primal objval = -0.7367155 completed dbl_ILLsimplex scaled_lp: time = 0.180, pI = 611, pII = 391, dI = 0, dII = 0, opt = -0.736701 starting dbl_ILLsimplex on dbl_problem... Problem has 1668 rows and 4050 cols and 10899 nonzeros completed dbl_ILLsimplex dbl_problem: time = 0.004, pI = 0, pII = 0, dI = 0, dII = 0, opt = -0.736701 LP Value: 0.736701, status 1 solution is infeasible for constraint zn2_c, violation -9.59801e-20, in QSexact_optimal_test (src/exact.c:553) Performing Rational Basic Solve on unnamed, RAT_optimal, check done in 0.076005 seconds, PS F 0, DS F 0, in QSexact_basis_status (src/exact.c:1087) Retesting solution, in QSexact_solver (src/exact.c:1519) Problem solved to optimality, LP value 0.736701, in QSexact_optimal_test (src/exact.c:746) ================================================================================ Problem Solved Exactly ================================================================================ Time for SOLVER: 0.28 seconds. Disabling EG-GMP mempool
The problem solves exactly, with an identical solution to the ones from the floating point solvers with a precision of $1\times10^{-10}$.
!gzip -d -f iAF1260.sol.gz
exact_result_str = !head iAF1260.sol | grep Value
exact_result = exact_result_str[0].split("=")[1].strip()
exact_result
'146888000/199386199'
pandas.DataFrame.from_dict({"values": solver_results, "error": solver_results - float(Fraction(exact_result))})
error | values | |
---|---|---|
cglpk | 2.087197e-11 | 0.736701 |
cplex | -2.066314e-11 | 0.736701 |
glpk | 1.858336e-11 | 0.736701 |
gurobi | -1.203304e-11 | 0.736701 |
COBRA toolbox | -1.053628e-10 | 0.736701 |
m = original.copy()
m.id += "_int"
# make the biomass an integerized version
biomass = m.reactions.get_by_id("Ec_biomass_iAF1260_core_59p81M")
biomass._metabolites = {met: ceil(abs(stoic)) * sign(stoic) for met, stoic in biomass.metabolites.items()}
m_array = m.to_array_based_model()
S = m_array.S.todense()
m_index, r_index = abs(S - S.astype(int)).nonzero()
# I checked, these all have 0.5 O2 in the stoichiometry
for r in r_index.tolist()[0]:
m.reactions[r] *= 2 # scaling a reaction by 2 is the same reaction, only now we have integer entries
We now cast $S$ as an integer matrix, and ensure it is equal to the floating point version.
m_array = m.to_array_based_model()
S_float = m_array.S.todense()
S = S_float.astype(int)
abs(S - S_float).max()
0.0
We increase the flux through the already open exchange reactions to compensate for increased biomass requirements and to make all the bounds integers as well. We also will round up ATPM, which fixes the flux to 8.39, a non-integer value.
for r in m.reactions.query("EX_"):
if r.lower_bound < 0:
r.lower_bound = -999999.
m.reactions.ATPM.lower_bound = ceil(m.reactions.ATPM.lower_bound)
m.reactions.ATPM.upper_bound = ceil(m.reactions.ATPM.upper_bound)
Convert the problem to MILP
for r in m.reactions:
r.variable_kind = "integer"
Using integer math, we can prove that our solution fulfills $ S \cdot v = 0$ exactly
sol = m.optimize(solver="gurobi") # gurobi has good performance for MILP's
v = array(sol.x).reshape(len(sol.x), 1).round(3).astype(int) # cast v to int type
abs(S * v).max()
0
We still satisfy $ \text{upper_bounds} \ge v \ge \text{lower_bounds} $
m_array = m.to_array_based_model()
v_array = v[:, 0]
bool(all((m_array.upper_bounds >= v_array) & (v_array >= m_array.lower_bounds)))
True
This is not because our solution is simply 0
sol.f
2256.0
Additionally, iAF1260 can produce every biomass component with an integer solution (exactly 0 error).
m2 = m.copy()
for metabolite in biomass.metabolites:
r = cobra.Reaction("DM_BIOMASS_" + metabolite.id)
r.add_metabolites({m2.metabolites.get_by_id(metabolite.id): -1})
r.variable_kind = "integer"
m2.add_reaction(r)
m_array = m2.to_array_based_model()
S_float = m_array.S.todense()
S = S_float.astype(int)
abs(S - S_float).max() # once again, this S consists only of integer values.
0.0
table = {}
for r in m2.reactions.query("DM_BIOMASS"):
m2.change_objective(r)
sol = m2.optimize(solver="gurobi")
v = array(sol.x).reshape(len(sol.x), 1).round(3).astype(int)
table[r.id] = {"error": abs(S * v).max(), "production": sol.f}
pandas.DataFrame.from_dict(table).T.min()
error 0 production 1000 dtype: float64