Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#794 closed request (fixed)

support for htslib

Reported by: Peter Owned by: Peter
Priority: critical Milestone: yat 0.13
Component: build Version:
Keywords: Cc:

Description (last modified by Peter)

htslib has released a rc8 https://github.com/samtools/htslib/releases/tag/0.2.0-rc8 and if they keep up the momentum it looks like htslib will replace libbam. If that'll happen we should make configure aware of this and adapt so it will detect whether libbam or htslib is available.

Change History (29)

comment:1 Changed 5 years ago by Peter

Description: modified (diff)
Owner: changed from Jari Häkkinen to Peter
Summary: support for hstlibsupport for htslib

Also looking in their Makefile has an install rule (in contrast to samtools) and it installs header files under $(prefix)/include/htslib, so our magic in configure detecting include style needs to be amended with another alternative: htslib. That should have higher priority than bam and samtools i.e. we should try to include:

  • bam.h
  • htslib/bam.h
  • bam/bam.h
  • samtools/bam.h

comment:2 Changed 5 years ago by Peter

Just to make things more fun, it seems they have removed bam.h in htslib and that stuff now lives in sam.h. Well not really a problem as it can be solved with magic: look for htslib/sam.h instead and if found define YAT_BAM_HEADER to htslib/sam.h. Possibly also a good idea to change other header tests to look for sam.h (especially the plain one) in case someone choose to install headers in e.g. /usr/local/include/htslib-1 or /opt/local/foo. That however would imply that if we've found a 'sam.h' we don't know if we should include sam.h or bam.h to get declaration of bam1_t. Well, we can solve that too...

comment:3 Changed 5 years ago by Peter

Milestone: yat 0.x+yat 0.13
Priority: minorcritical
Type: discussionrequest
Version: trunk

comment:4 Changed 5 years ago by Peter

Status: newassigned

comment:5 Changed 5 years ago by Peter

(in r3350) Make configure htslib aware. Header detection changed to look for 'sam.h', 'htslib/sam.h', 'bam/sam.h', and 'samtools/sam.h'; if second alternative is detected, YAT_HAVE_HTSLIB_SAM_H is #defined. YAT_SAM_HEADER is defined appropriately; since 'bam.h' doesn't exist in htslib, YAT_BAM_HEADER is not defined (users should #include YAT_SAM_HEADER instead). Likewise linker test is modified so it tries '-lhts' before trying '-lbam'.

Note, configure now works against htslib, but compilation still fails due to API changes between bamlib and htslib. For that reason configure also checks if header 'hts.h' is available and #defined YAT_HAVE_HTSLIB, which will be used for conditional implementations.

comment:6 Changed 5 years ago by Peter

They have changed some API between samtools and htslib. For example struct bam_index_t has changed to hts_idx_t. In yat we have public funtion:

const bam_index_t* index(void) const;

and to make that compile both against samtools and htslib, I see three options:

1) PP logic around code when needed

#ifndef YAT_HAVE_HTSLIB
const bam_index_t* index(void) const;
#endif
#ifdef YAT_HAVE_HTSLIB
const hts_idx_t* index(void) const;
#endif

2) typedef new_type old_type and then only use old_type

#ifdef YAT_HAVE_HTSLIB
typedef hts_idx_t bam_index_t;
#endif
const bam_index_t* index(void) const;

3) typedef old_type new_type and then only use new_type

#ifndef YAT_HAVE_HTSLIB
typedef bam_index_t hts_idx_t;
#endif
const hts_idx_t* index(void) const;

1) is straightforward and easy to understand. The downside is that the logic has to be inserted in multiple places. For users bam_index_t is valid when compiling against samtools and hts_idx_t is only valid when compiling against htslib.

2) is probably the minimal change, but the downside is that it encourage users to use the old type (bam_index_t). For users bam_index_t is valid in both cases, and hts_idx_t is only valid when compiling against htslib.

3) is very similar to 2). It forces that we need to change all bam_index_t to hts_idx_t internally. For users bam_index_t is valid when compiling against samtools, and hts_idx_t` is valid in both cases.

I see no reason to choose 1). 3) is tempting and makes a future remove support for samtools very clean. The downside is that we break code that uses struct bam_index_t if users upgrade to htslib. But one could argue that is not really yat's responsibility that samtools/htslib folks break things. And I suppose it's possible for users to use some PP magic with YAT_HAVE_HTSLIB and solve the issue (a test case?).

comment:7 Changed 5 years ago by Peter

(In [3351]) make BamRead? compile with htslib. refs #794

comment:8 Changed 5 years ago by Peter

(In [3353]) refs #794. First version that compiles against htslib. Alibi implementations in some cases though.

comment:9 Changed 5 years ago by Peter

(In [3354]) refs #794. fix configure test so it works with htslib.

comment:10 Changed 5 years ago by Peter

(In [3355]) refs #794. implement BamRead::end() and BamRead::sequence() correctly (not using our ref impl) when having htslib.

comment:11 Changed 5 years ago by Peter

(In [3356]) refs #794. make test/pileup.cc compile with htslib

comment:12 Changed 5 years ago by Peter

(In [3357]) refs #794. implement open and close functions

comment:13 Changed 5 years ago by Peter

(In [3358]) fix how to read from a hts_itr_t. refs #794

comment:14 Changed 5 years ago by Peter

(In [3359]) fix issues in OutBamFile::write. refs #794

comment:15 Changed 5 years ago by Peter

All tests now PASS.

comment:16 Changed 5 years ago by Peter

(In [3362]) avoid temp variable. refs #794

comment:17 Changed 5 years ago by Peter

(In [3363]) improve docs. refs #794

comment:18 Changed 5 years ago by Peter

(In [3364]) describe htslib support in NEWS and README. refs #794

comment:19 Changed 5 years ago by Jari Häkkinen

Maybe a --with-htslib option to configure should be added too. I have htslib installed in a local directory structure, i.e., not in /usr nor /usr/local.

comment:20 Changed 5 years ago by Peter

There is a --with-samtools option, which perhaps should change name to --with-htslib, but it doesn't do what you want. It's only function is --with-samtools=no (or --without-samtools) in which case certain parts of yat is not compiled. We should probably change name bcs in the future there'll be no reason to support libbam 0.x.

I'm a bit hesitant to add that semantics of --with-htslib mainly because I'm trying to phase out that behaviour. It's coupling things together, location of header files and library, and it mandates the order in which flags are added, which gives limited flexibility to the user. See, for instance, ticket:467 in svndigest. I think it's better to document the usage of CPPFLAGS and LDFLAGS. But I'm not entirely against it; I can see the point being consistent with options --with-gsl and --with-boost, but it also makes the logic in configure.ac more complex.

comment:21 in reply to:  20 Changed 5 years ago by Jari Häkkinen

Replying to peter:

I'm a bit hesitant to add that semantics of --with-htslib mainly because I'm trying to phase out that behaviour. It's coupling things together, location of header files and library, and it mandates the order in which flags are added, which gives limited flexibility to the user. See, for instance, ticket:467 in svndigest. I think it's better to document the usage of CPPFLAGS and LDFLAGS. But I'm not entirely against it; I can see the point being consistent with options --with-gsl and --with-boost, but it also makes the logic in configure.ac more complex.

I use the CPPFLAGS and LDFLAGS and my request for adding --with-htslib was basically inspired by the other similar directives. I can live without it so lets forget this.

comment:22 Changed 5 years ago by Peter

(In [3398]) refs #794

Global arrays bam_nt16_stable and bam_nt16_rev_table in libbam have in htslib changed name to seq_nt16_table and seq_nt16_str.

Introduce two functions that allow access to these arrays in a uniform way regardless whether we are working against libbam or htslib.

comment:23 Changed 5 years ago by Peter

(In [3400]) refs #794

Deprecate configure option --without-samtools and add new option --without-htslib that does the same thing.

Prefer variable names with htslib rather than samtools or libbam in autotool files.

comment:24 Changed 4 years ago by Peter

Our configure.ac is a bit problematic. The tests for header sam.h and tests for bam support are independent, which means you can end up with headers from samtools and lib from libhst or vice versa.

comment:25 Changed 4 years ago by Peter

(In [3413]) Prefer variable name HAVE_SAMTOOLS_EXECUTABLE instead of HAVE_SAMTOOLS for clarity. refs #794.

comment:26 Changed 4 years ago by Peter

(In [3415]) refs #794

Change logic in configure test looking for libhts/samtools headers and libraries. Instead of looking whether both headers and libraries are available, first look for both and <htslib/hts.h> and libhts. If found use that. If not found, look if both <bam.h> and libbam are available. If yes, use them. If not abort.

Only test for function bam_calend and global variable bam_nt16_rev_table when building against libbam as they are irrelevant in htslib mode.

yat/utility/config_public.h.in: change back to macros YAT_HAVE_BAM_H YAT_HAVE_BAM_BAM_H YAT_HAVE_SAMTOOLS_BAM_H as they were called in 0.12. Add a new macro YAT_HAVE_HTSLIB_SAM_H, which is defined when builiding against htslib (just like YAT_HAVE_HTSLIB).

yat/omic/config_bam.h: use new macro names in 'yat/utility/config_public.h' and avoid having nested if statements and instead use style:

#if defined YAT_HAVE_HTSLIB_HTS_H # define YAT_SAM_HEADER "htslib/sam.h" #elif defined YAT_HAVE_BAM_H # define YAT_BAM_HEADER "bam.h" # define YAT_SAM_HEADER "sam.h" #elif defined YAT_HAVE_BAM_BAM_H # define YAT_BAM_HEADER "bam/bam.h" # define YAT_SAM_HEADER "bam/sam.h" #endif

comment:27 Changed 4 years ago by Peter

(In [3426]) #define YAT_HAVE_LIBBAM when linking against libbam. refs #794

comment:28 Changed 4 years ago by Peter

Resolution: fixed
Status: assignedclosed

(In [3429]) mention in NEWS that YAT_HAVE_LIBBAM and YAT_BAM_HEADER are not defined when building against htsl. closes #794

comment:29 Changed 4 years ago by Peter

(In [3431]) revert r3426 and some cosmetic changes (refs #794)

Note: See TracTickets for help on using tickets.