
You are writing convoluted code and hoping that your compiler will figure it out and convert it internally to the form I posted. Sometimes it does, sometimes it doesn't. In this case it generates reasonable code but doesn't vectorize it for some reason. WTF.

I prefer to just add alignment specification and move on, assuming I don't care about portability. If portability matters, reread my original post ;)
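
For anyone following along, here is a minimal sketch of what "adding an alignment specification" could look like on GCC or Clang. The function name `add_aligned`, the element type, and the 16-byte figure are assumptions made for illustration, not the code from the post referred to above. `__builtin_assume_aligned` is a GCC/Clang builtin, so this deliberately gives up portability.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: element-wise dst[i] += src[i].
 * The caller promises that both pointers are 16-byte aligned;
 * breaking that promise is undefined behavior. */
void add_aligned(uint32_t *restrict dst, const uint32_t *restrict src, size_t n)
{
    /* GCC/Clang-specific hint: tells the optimizer it may assume
     * 16-byte alignment when it generates loads and stores. */
    uint32_t *d = __builtin_assume_aligned(dst, 16);
    const uint32_t *s = __builtin_assume_aligned(src, 16);

    for (size_t i = 0; i < n; i++)
        d[i] += s[i];
}
```

Whether the compiler actually vectorizes this loop still depends on the compiler version and flags, so it is worth checking the generated assembly.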



It's not convoluted. It's actually clear and well-defined, which makes it easier to reason about.

I'd call compiler-specific alignment attributes more arcane, convoluted, and susceptible to future bugs.

Vectorization isn't a panacea. You need to benchmark to be sure; lacking that, I expect GCC to be better at optimizing code than you. If you disagree, please manually write a vectorized version that handles non-aligned addition and post your results :)
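
As a starting point for that experiment, here is a rough sketch of a hand-vectorized addition that does not require 16-byte alignment. It assumes x86 with SSE2 and falls back to a plain loop elsewhere; the name `add_unaligned` and the choice of `uint32_t` elements are illustrative assumptions, not code from this thread. "Non-aligned" here means not aligned to the vector width; the pointers must still be valid `uint32_t` pointers.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical example: dst[i] += src[i] with no 16-byte alignment requirement. */
void add_unaligned(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Four 32-bit lanes per iteration, using unaligned loads and stores. */
    for (; i + 4 <= n; i += 4) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_add_epi32(a, b));
    }
#endif
    /* Scalar tail; also the whole loop when SSE2 is not available. */
    for (; i < n; i++)
        dst[i] += src[i];
}
```

Whether this beats what GCC emits for the plain loop is exactly the question a benchmark would have to settle.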



